[Title of Document] CLAIMS
[Claim 1] 

A gene consisting of at least one of the following definitions correlated with
prediction of the postoperative prognosis of breast cancer;
1) a marker gene group capable of establishing classification of genes from breast
cancer patients who died within 5 years after a surgical operation (5y-D group) and genes
from patients who survived free of disease for several years or more after the operation
(5y-S group), depending on their expression functions, in estrogen receptor-negative
breast cancer,
2) a marker gene group capable of establishing classification of genes from n0 breast
cancer patients whose cancer recurred within 5 years after an operation (5Y-R group) and
genes from patients who survived free of disease for 5 years or more after the operation
(5Y-F group), depending on their expression functions, in (node-negative)(n0) breast
cancer with no metastasis to a lymph node in the operation,
3) a marker gene group capable of establishing classification of genes from breast
cancer patients who died within 5 years after a surgical operation (5D group) and genes
from patients who survived free of disease for several years or more after the operation
(5S group), depending on their expression functions, in primary breast cancer.
[Claim 2] 

A gene selected from the following sequences correlated with prediction of the
postoperative prognosis of primary breast cancer;
pro-alpha-1 type 3 collagen (PIIIP),
complement component C1r,
dihydropyrimidinase-like 3 (DPYSL3),
protein tyrosine kinase 9-like (PTK9L),
carboxypeptidase E (CPE),
alpha-tubulin,
beta-tubulin,
heat shock protein HSP 90-alpha gene,
malate dehydrogenase,
NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 3 (NDUFB3).
[Claim 3] 

A gene selected from the following sequences highly expressed in a group of good
prognosis correlated with prediction of the postoperative prognosis of primary breast cancer;
pro-alpha-1 type 3 collagen (PIIIP),
complement component C1r,
dihydropyrimidinase-like 3 (DPYSL3),
protein tyrosine kinase 9-like (PTK9L),
carboxypeptidase E (CPE),
alpha-tubulin,
beta-tubulin.
[Claim 4] 

A gene selected from the following sequences highly expressed in a group of bad
prognosis correlated with prediction of the postoperative prognosis of primary breast cancer;
heat shock protein HSP 90-alpha gene,
malate dehydrogenase,
NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 3 (NDUFB3).
[Claim 5] 

A gene selected from the following sequences correlated with prediction of the
postoperative prognosis, in (node-negative)(n0) breast cancer with no metastasis to a lymph
node in operation;
AF058701/ DNA polymerase zeta catalytic subunit (REV3),
AI066764/ lectin, galactoside-binding, soluble, 1 (galectin 1),
x15940/ ribosomal protein L31,
Hs.94653/ neurochondrin (KIAA0607),
M13436/ ovarian beta-A-inhibin,
Hs.5002/ copper chaperone for superoxide dismutase; CCS,
D67025/ proteasome (prosome, macropain) 26S subunit, non-ATPase, 3,
M80469/ MHC class I HLA-J gene,
Hs.4864/ ESTs,
Hs.106326/ ESTs.
[Claim 6]

A gene selected from the following sequences highly expressed in a group of bad
prognosis correlated with prediction of the postoperative prognosis, in (node-negative)(n0)
breast cancer with no metastasis to a lymph node in operation;
AF058701/ DNA polymerase zeta catalytic subunit (REV3),
AI066764/ lectin, galactoside-binding, soluble, 1 (galectin 1),
x15940/ ribosomal protein L31.
[Claim 7] 

A gene selected from the following sequences highly expressed in a group of good
prognosis correlated with prediction of the postoperative prognosis, in (node-negative)(n0)
breast cancer with no metastasis to a lymph node in operation;
Hs.94653/ neurochondrin (KIAA0607),
M13436/ ovarian beta-A-inhibin,
Hs.5002/ copper chaperone for superoxide dismutase; CCS,
D67025/ proteasome (prosome, macropain) 26S subunit, non-ATPase, 3,
M80469/ MHC class I HLA-J gene,
Hs.4864/ ESTs,
Hs.106326/ ESTs.
[Claim 8] 

A gene selected from the following sequences correlated with prediction of the
postoperative prognosis, in estrogen receptor-negative breast cancer;
Hs.108504/ FLJ20113/ ubiquitin-specific protease otubain 1
Hs.146550/ MYH9/ myosin, heavy polypeptide 9, non-muscle
Hs.194691/ RAI3/ retinoic acid induced 3
Hs.1975/ TDRD3/ tudor domain containing 3
Hs.203952/ TRRAP/ transformation/transcription domain-associated protein
Hs.278607/ GSA7/ ubiquitin activating enzyme E1-like protein
Hs.429/ ATP5G3/ ATP synthase, H+ transporting, mitochondrial F0 complex, subunit c
(subunit 9) isoform 3
Hs.75305/ AIP/ aryl hydrocarbon receptor interacting protein
Hs.81170/ PIM1/ pim-1 oncogene
Hs.99987/ ERCC2/ excision repair cross-complementing rodent repair deficiency,
complementation group 2
Y12781/ Transducin (beta) like 1 protein
Hs.104417/ KIAA1205 protein
cl.21783/ Hypothetical protein
Hs.112628/ Hypothetical protein: MGC43581
Hs.170345/ Hypothetical protein FLJ13710
Hs.53996/ weakly similar to zinc finger protein 135
Hs.55422/ Hypothetical protein
Hs.112718/ EST
Hs.115880/ EST
Hs.126495/ EST.

[Claim 9] 

A gene selected from Claim 8, as a gene highly expressed in a group of bad 
prognosis. 
[Claim 10]

A DNA microarray carrying thereon the gene according to any one of Claims 1 to 9. 
[Claim 11] 

The microarray according to Claim 10, wherein the DNA microarray is a fiber type
microarray.
[Claim 12] 

A method of inspecting the postoperative prognosis of breast cancer using as a 
marker the gene and/or probe according to any one of Claims 1 to 9. 
[Claim 13]

A method of inspecting the postoperative prognosis of breast cancer using the 
microarray according to Claim 10 or 11. 
[Claim 14] 

A method of screening cancer therapeutic medicines for controlling the postoperative
prognosis of breast cancer using as a marker the gene and/or probe according to any one of
Claims 1 to 9.
[Claim 15] 

A method of screening cancer therapeutic medicines for controlling the postoperative 
prognosis of breast cancer using the microarray according to Claim 10 or 11. 
[Claim 16]

A diagnosis kit for the postoperative prognosis of breast cancer containing a reagent 
using as a marker the gene and/or probe according to any one of Claims 1 to 9. 
[Claim 17] 

The diagnosis kit according to Claim 16, wherein the kit comprises a microarray. 
[Claim 18]

The diagnosis kit according to Claim 17, wherein the microarray is a fiber type 
microarray. 
[Title of Document] SPECIFICATION

[Title of the Invention] GENE RELATING TO ESTIMATION OF POSTOPERATIVE
PROGNOSIS FOR BREAST CANCER
[TECHNICAL FIELD]
[0001]

The present invention relates to a gene correlated with prediction of the postoperative
prognosis of breast cancer. Further, the present invention relates to a method of inspecting
the postoperative prognosis of breast cancer using this gene, a method of screening cancer
therapeutic medicines for controlling the postoperative prognosis of breast cancer, and a
diagnosis kit for the postoperative prognosis of breast cancer.
[BACKGROUND ART] 
[0002] 

Breast cancer is a leading cause of cancer death among women; however, no decisive
factors for determining the grade of malignancy and the survival prognosis from the
biological standpoint have yet been found.
[0003] 

The estrogen receptor (ER) status is one determinant of the clinical and biological
behavior of human breast cancer. Adjuvant hormone therapy is usually effective in
ER-positive breast cancer patients irrespective of age, menopausal status, axillary node
involvement, and tumor diameter. However, ER-negative breast cancer is resistant to this
therapy (Non Patent Literatures No. 1 and 2). Patients having an ER-negative tumor do not
necessarily show the same response to chemotherapy. Since existing indices cannot classify
breast cancer of this type according to clinical symptoms, the postoperative prognosis is
recognized to vary widely (Non Patent Literatures No. 3 and 4).
[0004] 

The prognosis of breast cancer patients with no lymph node metastasis (node-negative
breast cancer; n0) is better than that of metastatic breast cancer patients. However, in
Japan, the present inventors have found that 16% of node-negative breast cancer patients
relapse within 5 years after the initial operation (Non Patent Literature No. 5).

[0005] 

Prediction of the postoperative prognosis of breast cancer patients is becoming
increasingly important from the standpoint of the adjuvant therapies currently available.
A gene marker that is useful in identifying patients likely to relapse after an operation
has the merit that suitable preoperative adjuvant therapy can be applied to a high-risk
patient, and it enables avoidance of unnecessary, complicated and uncomfortable side
effects.
[0006]

Conventionally, postoperative procedures for individual patients are determined
depending on tumor diameter and stage, metastasis to a lymph node, diagnosis by
clinicopathological factors, examination of hormone receptors, and the like; however, these
are not decisive methods (Non Patent Literatures No. 6-11).
[0007]

Recently, prognosis markers for postoperative breast cancer patients have been proposed
with the aim of determining the importance of gene mutations. These gene mutations
include a mutation of p53 (Non Patent Literature No. 12), loss of heterozygosity in several
alleles (Non Patent Literature No. 13), and abnormal expressions of a BRCA2 gene (Non
Patent Literature No. 14), WT1 gene (Non Patent Literature No. 15), HER2/neu gene (Non
Patent Literature No. 16) and Ki-67 gene (Non Patent Literature No. 17). However, these
would not be recognized as effective means of predicting prognosis, considering that
cancer is a disease caused by the accumulation of abnormalities in multiple genes.

[0008]

Further, in recent years, genome projects in various organisms have been progressing,
and many genes and their base sequences, typified by human genes, are being clarified
rapidly. The function of a gene having a clarified sequence can be checked by various
methods. One effective approach is gene expression analysis utilizing the clarified base
sequence information. For example, methods utilizing various nucleic acid-nucleic acid
hybridization reactions, as typified by Northern hybridization, and various PCR reactions
have been developed, and relations between various genes and the expression of their
biological functions can be checked by these methods. Since the number of genes that can
be handled by these methods is limited, a new analysis methodology called the DNA
microarray method (DNA chip method) has been developed. It enables simultaneous
expression analysis of multiple genes, for comprehensive and systematic analysis of the
very large number of genes, on the scale of a whole organism, that are being clarified
through genome projects.
[0009] 

Many forms of DNA microarray are known, such as one in which DNA synthesis is
conducted on many discrete cells by applying a lithography technology (Patent Literature
No. 1), one in which cells composed of grooves or holes are formed on a board and a probe
is fixed to the inner wall of each cell (Patent Literatures No. 2 and 3), and a microarray
in which a probe is fixed to a gel such as acrylamide to increase the amount of probe
fixed on a chip (Patent Literatures No. 4 and 5).
[0010] 

Some of the present inventors previously developed a new type of microarray and a
method for producing the same (Patent Literatures No. 6 and 7).
The above invention provides a chip that can be obtained by fabricating a fiber array
that retains a nucleic acid-fixed gel and cutting this array along a direction crossing the
fiber axis of the array. The chip can be utilized as a two-dimensional high-density nucleic
acid-fixed array, namely, a microarray.
[0011]

Recent studies have found that cDNA microarray technology is effective for
identification of novel gene markers for cancer diagnosis. To date, some researchers have
carried out microarray analysis of breast cancer; however, there is no description of gene
expression characteristics of breast cancer capable of predicting the postoperative
prognosis of breast cancer (Non Patent Literatures No. 18-24). As one exception, it has
been shown that a specific profile of a lymph node metastasis-negative tumor predicts a
short interval before progression to distant metastasis (Non Patent Literature No. 25).
[Patent Literature No. 1] USP 5445934
[Patent Literature No. 2] Tokkyo KOKAI (unexamined Japanese patent application) No.
11-108928
[Patent Literature No. 3] Tokkyo KOKAI No. 2000-78998
[Patent Literature No. 4] USP 5770721
[Patent Literature No. 5] Tokkyo KOKAI No. 2000-60554
[Patent Literature No. 6] Tokkyo KOKAI No. 2000-270878
[Patent Literature No. 7] Tokkyo KOKAI No. 2000-270879
[Non Patent Literature No. 1] J Clin Oncology (2001) 19, 3817-1827
[Non Patent Literature No. 2] Breast Cancer (2001) 8, 298-304
[Non Patent Literature No. 3] J Natl Inst (1991) 83, 154-155
[Non Patent Literature No. 4] J Natl Cancer Inst (2000) 93, 979-989
[Non Patent Literature No. 5] Clin Cancer Res (2000) 6, 3193-3198
[Non Patent Literature No. 6] Cancer (1982) 50, 2131-2138
[Non Patent Literature No. 7] Histopathology (1991) 19, 403-410
[Non Patent Literature No. 8] Int J Cancer (1996) 69, 135-141
[Non Patent Literature No. 9] Am J Clin Oncol (1997) 20, 546-551
[Non Patent Literature No. 10] Eur J Cancer (2002) 38, 1329-1334
[Non Patent Literature No. 11] Jpn J Cancer Res (2000) 91, 293-300
[Non Patent Literature No. 12] Breast Cancer Res Treat (2001) 69, 65-68
[Non Patent Literature No. 13] Int J Clin Oncol (2001) 6, 6-12
[Non Patent Literature No. 14] Int J Cancer (2002) 198, 879-882
[Non Patent Literature No. 15] Clin Cancer Res (2002) 8, 1167-1171
[Non Patent Literature No. 16] Arch Surg (2000) 135, 1469-1474
[Non Patent Literature No. 17] J Pathol (1999) 187, 207-216
[Non Patent Literature No. 18] Proc Natl Acad Sci USA (1999) 96, 9212-9217
[Non Patent Literature No. 19] Nature (2000) 406, 747-752
[Non Patent Literature No. 20] Proc Natl Acad Sci USA (2001) 98, 11462-11467
[Non Patent Literature No. 21] Cancer Res (2001) 61, 5979-5984
[Non Patent Literature No. 22] Cancer Res (2000) 60, 2232-2238
[Non Patent Literature No. 23] Cancer Res (2001) 61, 5168-5178
[Non Patent Literature No. 24] Proc Natl Acad Sci USA (2001) 98, 10869-10874
[Non Patent Literature No. 25] N Engl J Med (2002) 347, 1999-2009

[DISCLOSURE OF THE INVENTION] 

[Problem to Be Solved] 

[0012] 

The present invention has an object of providing innovative means for predicting the
postoperative prognosis of breast cancer patients from the standpoint of gene expression,
based on results obtained by genome-wide and comprehensive analysis on gene expression
in breast cancer.
[Means to Solve the Problem] 
[0013] 

The present inventors have comprehensively analyzed the expression of human genes
with a DNA microarray and compared the gene expression functions of breast cancers in
various conditions, thereby establishing a system for predicting the postoperative
prognosis of breast cancer.
[0014] 

That is, the present invention provides the following genes (groups) (1) to (8).

(1) A gene consisting of at least one of the following definitions correlated with
prediction of the postoperative prognosis of breast cancer;
1) a marker gene group capable of establishing classification of genes from breast
cancer patients who died within 5 years after a surgical operation (5y-D group) and genes
from patients who survived free of disease for several years or more after the operation
(5y-S group), depending on their expression functions, in estrogen receptor-negative
breast cancer,
2) a marker gene group capable of establishing classification of genes from n0 breast
cancer patients whose cancer recurred within 5 years after an operation (5Y-R group) and
genes from patients who survived free of disease for 5 years or more after the operation
(5Y-F group), depending on their expression functions, in (node-negative)(n0) breast
cancer with no metastasis to a lymph node in the operation,
3) a marker gene group capable of establishing classification of genes from breast
cancer patients who died within 5 years after a surgical operation (5D group) and genes
from patients who survived free of disease for several years or more after the operation
(5S group), depending on their expression functions, in primary breast cancer.

(2) A gene selected from the following sequences correlated with prediction of the
postoperative prognosis of primary breast cancer;
pro-alpha-1 type 3 collagen (PIIIP),
complement component C1r,
dihydropyrimidinase-like 3 (DPYSL3),
protein tyrosine kinase 9-like (PTK9L),
carboxypeptidase E (CPE),
alpha-tubulin,
beta-tubulin,
heat shock protein HSP 90-alpha gene,
malate dehydrogenase,
NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 3 (NDUFB3).
(3) A gene selected from the following sequences highly expressed in a group of good
prognosis correlated with prediction of the postoperative prognosis of primary breast cancer;
pro-alpha-1 type 3 collagen (PIIIP),
complement component C1r,
dihydropyrimidinase-like 3 (DPYSL3),
protein tyrosine kinase 9-like (PTK9L),
carboxypeptidase E (CPE),
alpha-tubulin,
beta-tubulin.

(4) A gene selected from the following sequences highly expressed in a group of bad
prognosis correlated with prediction of the postoperative prognosis of primary breast cancer;
heat shock protein HSP 90-alpha gene,
malate dehydrogenase,
NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 3 (NDUFB3).

(5) A gene selected from the following sequences correlated with prediction of the
postoperative prognosis, in (node-negative)(n0) breast cancer with no metastasis to a lymph
node in operation;
AF058701/ DNA polymerase zeta catalytic subunit (REV3),
AI066764/ lectin, galactoside-binding, soluble, 1 (galectin 1),
x15940/ ribosomal protein L31,
Hs.94653/ neurochondrin (KIAA0607),
M13436/ ovarian beta-A-inhibin,
Hs.5002/ copper chaperone for superoxide dismutase; CCS,
D67025/ proteasome (prosome, macropain) 26S subunit, non-ATPase, 3,
M80469/ MHC class I HLA-J gene,
Hs.4864/ ESTs,
Hs.106326/ ESTs.

(6) A gene selected from the following sequences highly expressed in a group of bad
prognosis correlated with prediction of the postoperative prognosis, in (node-negative)(n0)
breast cancer with no metastasis to a lymph node in operation;
AF058701/ DNA polymerase zeta catalytic subunit (REV3),
AI066764/ lectin, galactoside-binding, soluble, 1 (galectin 1),
x15940/ ribosomal protein L31.

(7) A gene selected from the following sequences highly expressed in a group of good
prognosis correlated with prediction of the postoperative prognosis, in (node-negative)(n0)
breast cancer with no metastasis to a lymph node in operation;
Hs.94653/ neurochondrin (KIAA0607),
M13436/ ovarian beta-A-inhibin,
Hs.5002/ copper chaperone for superoxide dismutase; CCS,
D67025/ proteasome (prosome, macropain) 26S subunit, non-ATPase, 3,
M80469/ MHC class I HLA-J gene,
Hs.4864/ ESTs,
Hs.106326/ ESTs.

(8) A gene selected from the following sequences correlated with prediction of the
postoperative prognosis, in estrogen receptor-negative breast cancer;
Hs.108504/ FLJ20113/ ubiquitin-specific protease otubain 1
Hs.146550/ MYH9/ myosin, heavy polypeptide 9, non-muscle
Hs.194691/ RAI3/ retinoic acid induced 3
Hs.1975/ TDRD3/ tudor domain containing 3
Hs.203952/ TRRAP/ transformation/transcription domain-associated protein
Hs.278607/ GSA7/ ubiquitin activating enzyme E1-like protein
Hs.429/ ATP5G3/ ATP synthase, H+ transporting, mitochondrial F0 complex, subunit c
(subunit 9) isoform 3
Hs.75305/ AIP/ aryl hydrocarbon receptor interacting protein
Hs.81170/ PIM1/ pim-1 oncogene
Hs.99987/ ERCC2/ excision repair cross-complementing rodent repair deficiency,
complementation group 2
Y12781/ Transducin (beta) like 1 protein
Hs.104417/ KIAA1205 protein
cl.21783/ Hypothetical protein
Hs.112628/ Hypothetical protein: MGC43581
Hs.170345/ Hypothetical protein FLJ13710
Hs.53996/ weakly similar to zinc finger protein 135
Hs.55422/ Hypothetical protein
Hs.112718/ EST
Hs.115880/ EST
Hs.126495/ EST
[0015]

The present invention also provides a gene selected from the above-mentioned (8), as a
gene highly expressed in a group of bad prognosis.
Further, the present invention provides a DNA microarray carrying thereon the gene
according to any one of the above-mentioned (1) to (9), and preferably, the DNA microarray
is a fiber type microarray.
[0016] 

The above-mentioned gene can be used as a marker gene for postoperative prognosis 
of breast cancer. Further, it can be also used as a marker gene for cancer therapeutic 
medicines for controlling the postoperative prognosis of breast cancer. 
[0017]

The marker gene can be included as a reagent, and can be used as a diagnosis kit. 
The reagent kit includes a DNA microarray carrying thereon a marker gene, preferably, a 
fiber type microarray. 
[Effect of the Invention] 
[0018]

According to the means of the present invention, completely novel breast
cancer-correlated genes have been found, and simultaneously, it has been found that these
genes are deeply correlated with malignant transformation of breast cancer and ultimately
influence the prognosis of breast cancer patients. Further, by establishing a mathematical
formula for evaluating the expression status of the genes found, a completely novel and
effective system for predicting the postoperative prognosis of breast cancer has been
developed. Considering that cancer is a disease caused by gene abnormalities, the system of
the present invention, which is based on gene expression, is believed to be an innovative
prognosis prediction system that captures the biological essence of cancer and differs
fundamentally from conventional prognosis evaluation methods.

[BEST MODE FOR CARRYING OUT THE INVENTION]
[0019]

The marker gene group correlated with prediction of the postoperative prognosis of
breast cancer, as one aspect of the present invention, is obtained by cDNA microarray
analysis of the expression functions of genes from patients who died or relapsed within 5
years after a surgical operation and from patients who survived for 5 years or more after
the operation, in estrogen receptor-negative breast cancer, node-negative breast cancer
and primary breast cancer.
[0020] 

Specifically, one aspect of the present invention is a gene consisting of at least one of
the following definitions selected from known sequences correlated with prediction of the
postoperative prognosis of breast cancer;
1) a marker gene group capable of establishing classification of genes from breast
cancer patients who died within 5 years after a surgical operation (5y-D group) and genes
from patients who survived free of disease for several years or more after the operation
(5y-S group), depending on their expression functions, in estrogen receptor-negative
breast cancer,
2) a marker gene group capable of establishing classification of genes from n0 breast
cancer patients whose cancer recurred within 5 years after an operation (5Y-R group) and
genes from patients who survived free of disease for 5 years or more after the operation
(5Y-F group), depending on their expression functions, in (node-negative)(n0) breast
cancer with no metastasis to a lymph node in the operation,
3) a marker gene group capable of establishing classification of genes from breast
cancer patients who died within 5 years after a surgical operation (5D group) and genes
from patients who survived free of disease for several years or more after the operation
(5S group), depending on their expression functions, in primary breast cancer.
[0021] 

The gene correlated with prediction of the postoperative prognosis of breast cancer of
the present invention is obtained by evaluating cDNA microarray data using a
random-permutation test and a Mann-Whitney test. The present invention presents an
approach that is more useful at the clinical level, by evaluating gene expression functions
through a combination of a cDNA microarray and a semi-quantitative PCR experiment.
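
As an illustration only, the two statistical tests named above can be sketched for a single gene as follows. The specification does not give the exact test statistic, number of permutations, or significance threshold, so the difference-in-means statistic, the `n_perm` and `seed` parameters, and the sample values are all assumptions made for this sketch.

```python
import random
from statistics import mean


def mann_whitney_u(a, b):
    """U statistic of sample a versus b: pairs with a_i > b_j count 1, ties count 0.5."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided random-permutation p-value for the difference in group means.

    The choice of statistic and of n_perm is an assumption, not taken from the source.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign group labels
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical expression ratios for one gene (not data from the specification).
good_group = [1.2, 0.9, 1.5, 1.1, 1.3]
bad_group = [0.2, 0.4, 0.1, 0.5, 0.3]
u_stat = mann_whitney_u(good_group, bad_group)
p_perm = permutation_p_value(good_group, bad_group)
```

A gene would be a candidate marker when both tests indicate a difference between the two groups; the cut-off used for that decision is not stated in this part of the specification.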
[0022] 

In the present invention, a gene correlated with prediction of the postoperative
prognosis of primary breast cancer has been identified by evaluating gene expression
functions in breast cancer patients.
Specifically, one aspect of the present invention is a gene selected from the following
sequences selected from known sequences correlated with prediction of the postoperative
prognosis of primary breast cancer;
pro-alpha-1 type 3 collagen (PIIIP),
complement component C1r,
dihydropyrimidinase-like 3 (DPYSL3),
protein tyrosine kinase 9-like (PTK9L),
carboxypeptidase E (CPE),
alpha-tubulin,
beta-tubulin,
heat shock protein HSP 90-alpha gene,
malate dehydrogenase,
NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 3 (NDUFB3).
[0023]

Some of the above-mentioned genes are believed to be correlated with proliferation or
distant metastasis of tumor cells. For example, heat shock protein HSP 90-alpha is a
chaperone for many kinases and may promote the growth of cancer cells
(Neckers, L (2002) Trends Mol Med 8, S55-61). Malate dehydrogenase is an important
enzyme correlated with energy production in aerobic and anaerobic metabolism, and the
activity of malate dehydrogenase is correlated with a tumor marker for squamous cell
carcinoma (Ross, CD., et al. (2000) Otolaryngol Head Neck Surg 122, 195-200). NADH
dehydrogenase (ubiquinone) 1 beta subcomplex, 3 (NDUFB3) belongs to the mitochondrial
electron transport chain, and chromosome abnormality in a region containing NDUFB3 is
remarkable in the breast cancer cell line MDA-MB-231 (Xie, D., et al. (2002) Int J Oncol 21,
499-507).
[0024] 

The above-mentioned 10 genes correlated with prediction of the postoperative
prognosis of primary breast cancer are expressed differently in a group of good prognosis
(5S group) and a group of bad prognosis (5Y group), and 7 genes among the 10 genes are
genes highly expressed in a group of good prognosis (5S group).
Namely, one aspect of the present invention is a gene selected from the following
sequences highly expressed in a group of good prognosis selected from known sequences
correlated with prediction of the postoperative prognosis of primary breast cancer;
pro-alpha-1 type 3 collagen (PIIIP),
complement component C1r,
dihydropyrimidinase-like 3 (DPYSL3),
protein tyrosine kinase 9-like (PTK9L),
carboxypeptidase E (CPE),
alpha-tubulin,
beta-tubulin.

[0025] 
3 genes among the 10 genes correlated with prediction of the postoperative prognosis
of primary breast cancer are genes highly expressed in a group of bad prognosis (5Y group).
Namely, one aspect of the present invention is a gene selected from the following sequences
highly expressed in a group of bad prognosis selected from known sequences correlated
with prediction of the postoperative prognosis of primary breast cancer;
heat shock protein HSP 90-alpha gene,
malate dehydrogenase,
NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 3 (NDUFB3).
[0026] 

Here, the prediction index (PI) for primary breast cancer is defined as described below
and can be used for prediction of the postoperative prognosis of breast cancer.
Prediction index (PI) = (total of normalized expression ratios of the above-mentioned 7
genes highly expressed in a group of good prognosis in breast cancer tissue) - (total of
normalized expression ratios of the above-mentioned 3 genes highly expressed in a group of
bad prognosis in breast cancer tissue)
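
The PI definition above can be written out as a short calculation. This is a minimal sketch: the dictionary keys are illustrative labels for the 10 genes, the input is assumed to hold already normalized expression ratios (the normalization procedure is not described here), and the example values are hypothetical.

```python
# Illustrative labels for the 7 good-prognosis and 3 bad-prognosis genes (assumed key names).
GOOD_PROGNOSIS_GENES = (
    "PIIIP", "C1r", "DPYSL3", "PTK9L", "CPE", "alpha-tubulin", "beta-tubulin",
)
BAD_PROGNOSIS_GENES = ("HSP90-alpha", "malate dehydrogenase", "NDUFB3")


def prediction_index(ratios):
    """PI = (sum over the 7 good-prognosis genes) - (sum over the 3 bad-prognosis genes)."""
    good_total = sum(ratios[gene] for gene in GOOD_PROGNOSIS_GENES)
    bad_total = sum(ratios[gene] for gene in BAD_PROGNOSIS_GENES)
    return good_total - bad_total


# Hypothetical normalized expression ratios for one tumor sample.
sample = {gene: 1.0 for gene in GOOD_PROGNOSIS_GENES}
sample.update({gene: 0.5 for gene in BAD_PROGNOSIS_GENES})
pi_value = prediction_index(sample)  # 7 * 1.0 - 3 * 0.5
```

Under this definition a larger PI reflects relatively higher expression of the genes associated with the good-prognosis group.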
[0027] 

In the present invention, gene expression functions in breast cancer patients have been
evaluated and 10 genes correlated with prediction of the postoperative prognosis of
node-negative breast cancer have been identified.
Specifically, one aspect of the present invention is a gene selected from the following
sequences selected from known sequences correlated with prediction of the postoperative
prognosis, in (node-negative)(n0) breast cancer with no metastasis to a lymph node in
operation;
AF058701/ DNA polymerase zeta catalytic subunit (REV3),
AI066764/ lectin, galactoside-binding, soluble, 1 (galectin 1),
x15940/ ribosomal protein L31,
Hs.94653/ neurochondrin (KIAA0607),
M13436/ ovarian beta-A-inhibin,
Hs.5002/ copper chaperone for superoxide dismutase; CCS,
D67025/ proteasome (prosome, macropain) 26S subunit, non-ATPase, 3,
M80469/ MHC class I HLA-J gene,
Hs.4864/ ESTs,
Hs.106326/ ESTs.
[0028] 

The above-mentioned genes correlated with prediction of the postoperative prognosis
of node-negative breast cancer include genes correlated with proliferation and distant
metastasis of tumor cells. For example, galectin 1 is an autocrine-type cancer repressor
that regulates cell differentiation (AxelH, et al. (2003) Int. J. Cancer, 103: 370-379).
Further, a gene activating cancer metastasis is included.
[0029] 

The above-mentioned 10 genes correlated with prediction of the postoperative
prognosis of node-negative breast cancer are expressed differently in a group of good
prognosis (5Y-F group) and a group of bad prognosis (5Y-R group), and 3 genes among the
10 genes are genes highly expressed in a group of bad prognosis (5Y-R group). Namely,
one aspect of the present invention is a gene selected from the following sequences highly
expressed in a group of bad prognosis selected from known sequences correlated with
prediction of the postoperative prognosis, in node-negative breast cancer in operation;
AF058701/ DNA polymerase zeta catalytic subunit (REV3),
AI066764/ lectin, galactoside-binding, soluble, 1 (galectin 1),
x15940/ ribosomal protein L31.
[0030] 

7 genes among the 10 genes correlated with prediction of the postoperative prognosis
of node-negative breast cancer are genes highly expressed in a group of good prognosis
(5Y-F group). Namely, one aspect of the present invention is a gene selected from the
following sequences highly expressed in a group of good prognosis selected from known
sequences correlated with prediction of the postoperative prognosis, in node-negative breast
cancer;
Hs.94653/ neurochondrin (KIAA0607),
M13436/ ovarian beta-A-inhibin,
Hs.5002/ copper chaperone for superoxide dismutase; CCS,
D67025/ proteasome (prosome, macropain) 26S subunit, non-ATPase, 3,
M80469/ MHC class I HLA-J gene,
Hs.4864/ ESTs,
Hs.106326/ ESTs.
[0031] 

Here, the prognosis score (PS) for node-negative breast cancer is defined as described
below and can be used for prediction of the postoperative prognosis of breast cancer.

Prognosis score (PS) = (total of normalized expression ratios of the above-mentioned 3
genes highly expressed in a group of bad prognosis in breast cancer tissue) - (total of
normalized expression ratios of the above-mentioned 7 genes highly expressed in a group of
good prognosis in breast cancer tissue).
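
The prognosis score defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the dictionary layout, and the numeric expression ratios are hypothetical and are not taken from the specification; only the "sum of the 3 bad-prognosis genes minus sum of the 7 good-prognosis genes" rule comes from the text.

```python
# Identifiers of the 3 genes highly expressed in the bad-prognosis (5Y-R) group
# and the 7 genes highly expressed in the good-prognosis (5Y-F) group,
# as read from the lists in this specification.
BAD_PROGNOSIS_GENES = ["AF058701", "AI066764", "X15940"]
GOOD_PROGNOSIS_GENES = [
    "Hs.94653", "M13436", "Hs.5002", "D67025",
    "M80469", "Hs.4864", "Hs.106326",
]


def node_negative_prognosis_score(ratios):
    """Return PS = sum(bad-prognosis ratios) - sum(good-prognosis ratios).

    `ratios` maps a gene identifier to its normalized expression ratio in
    the breast cancer tissue of one patient (hypothetical input layout).
    """
    bad_total = sum(ratios[g] for g in BAD_PROGNOSIS_GENES)
    good_total = sum(ratios[g] for g in GOOD_PROGNOSIS_GENES)
    return bad_total - good_total


# Made-up ratios for one sample, for illustration only.
example = {g: 1.0 for g in BAD_PROGNOSIS_GENES}
example.update({g: 0.5 for g in GOOD_PROGNOSIS_GENES})
score = node_negative_prognosis_score(example)
```

A larger score indicates relatively stronger expression of the bad-prognosis genes; the specification does not state a cut-off for this score in this passage, so none is assumed here.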
[0032] 

In the present invention, 20 genes correlated with prediction of the postoperative
prognosis of estrogen receptor-negative breast cancer have been identified, by evaluating
gene expression functions in breast cancer patients.
Specifically, one aspect of the present invention is a gene selected from the following
sequences selected from known sequences correlated with prediction of the postoperative
prognosis, in estrogen receptor-negative breast cancer;

Hs.108504/ FLJ20113/ ubiquitin-specific protease otubain 1
Hs.146550/ MYH9/ myosin, heavy polypeptide 9, non-muscle
Hs.194691/ RAI3/ retinoic acid induced 3
Hs.1975/ TDRD3/ tudor domain containing 3
Hs.203952/ TRRAP/ transformation/transcription domain-associated protein
Hs.278607/ GSA7/ ubiquitin activating enzyme E1-like protein
Hs.429/ ATP5G3/ ATP synthase, H+ transporting, mitochondrial F0 complex, subunit c (subunit 9) isoform 3
Hs.75305/ AIP/ aryl hydrocarbon receptor interacting protein
Hs.81170/ PIM1/ pim-1 oncogene
Hs.99987/ ERCC2/ excision repair cross-complementing rodent repair deficiency, complementation group 2
Y12781/ Transducin (beta) like 1 protein
Hs.104417/ KIAA1205 protein
cl.21783/ Hypothetical protein
Hs.112628/ Hypothetical protein: MGC43581
Hs.170345/ Hypothetical protein FLJ13710
Hs.53996/ weakly similar to zinc finger protein 135
Hs.55422/ Hypothetical protein
Hs.112718/ EST
Hs.115880/ EST
Hs.126495/ EST

[0033] 

The above-mentioned genes correlated with prediction of the postoperative prognosis
of estrogen receptor-negative breast cancer include genes correlated with proliferation and
distant metastasis of tumor cells. For example, PIM1 is a serine/threonine kinase, and there
is a correlation between clinical results of prostate cancer and its expression (Oesterreich,
S., et al. (1996) Clin Cancer Res, 2, 1199-1206). TRRAP protein is a subunit of a mammalian
HAT complex, and antisense RNA against TRRAP inhibits estrogen-dependent growth of
breast cancer cells.

[0034]

The above-mentioned 20 genes correlated with prediction of the postoperative
prognosis of estrogen receptor-negative breast cancer show high expression in a group of
bad prognosis (5y-D group). Namely, one aspect of the present invention is a gene
selected from known sequences correlated with prediction of the postoperative prognosis, in
the above-mentioned estrogen receptor-negative breast cancer highly expressed in a group
of bad prognosis.
[0035] 

Here, postoperative prognosis of breast cancer can be predicted as described below,
based on the expression of the above-mentioned genes correlated with prediction of the
postoperative prognosis of estrogen receptor-negative breast cancer;

(1) the expression levels in breast cancer tissue of the above-mentioned 20 genes
correlated with prediction of the postoperative prognosis of estrogen receptor-negative
breast cancer are compared with the average value in a parent population, and if the
expression level of a gene is 2-fold or more of the average value in the parent population,
one point is imparted for that gene,

(2) the procedure of (1) is carried out on the 20 genes, and if the total is 8
points or more, prognosis is decided to be bad.
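
The two-step rule above amounts to a simple voting scheme, which might be sketched as follows. The function names, the input dictionaries, and the example values are assumptions made for illustration; the 2-fold criterion and the 8-point threshold are the ones stated in steps (1) and (2).

```python
def vote_score(expression, population_mean, fold=2.0):
    """Count the genes whose expression is `fold` times the population mean or more.

    `expression` and `population_mean` map gene identifiers to expression
    levels (hypothetical layout); one point is given per qualifying gene.
    """
    return sum(
        1
        for gene, level in expression.items()
        if level >= fold * population_mean[gene]
    )


def predict_prognosis(expression, population_mean, threshold=8):
    """Return 'bad' when the total vote reaches the threshold, else 'good'."""
    if vote_score(expression, population_mean) >= threshold:
        return "bad"
    return "good"


# Illustrative data: 20 placeholder genes with a population mean of 1.0 each.
genes = [f"gene_{i}" for i in range(20)]
means = {g: 1.0 for g in genes}
# A made-up sample in which exactly 8 genes are expressed at 2-fold the mean.
sample = {g: (2.0 if i < 8 else 1.0) for i, g in enumerate(genes)}
result = predict_prognosis(sample, means)
```

Keeping the fold criterion and the threshold as parameters makes it easy to see how sensitive a call is to either value.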

[0036]

The above-mentioned gene correlated with prediction of the postoperative prognosis of 
breast cancer can be used as a marker for inspection of breast cancer postoperative 
prognosis. Namely, one aspect of the present invention is a method of inspecting the 
postoperative prognosis of breast cancer using the above-mentioned gene as a marker. 
[0037]

The above-mentioned gene correlated with prediction of the postoperative prognosis of 
breast cancer can be used as a marker for screening of cancer therapeutic medicines for 
controlling the postoperative prognosis of breast cancer. Namely, one aspect of the present 
invention is a method of screening cancer therapeutic medicines for controlling the 
postoperative prognosis of breast cancer using the above-mentioned gene as a marker.
[0038] 

The above-mentioned gene correlated with prediction of the postoperative prognosis of
breast cancer can be used as a marker for diagnosis of the postoperative prognosis of breast
cancer. It is also possible to design probes specific to the above-mentioned gene and to use
these probes as a marker. These probes can be designed, for example, by Probe Quest
(registered trademark) manufactured by Dyna Com. Namely, one aspect of the present
invention is a diagnosis kit for the postoperative prognosis of breast cancer containing a
reagent using the above-mentioned gene as a marker.
[0039] 

The above-mentioned diagnosis kit can include a microarray. Namely, one aspect of 
the present invention is the diagnosis kit, wherein the diagnosis kit includes a microarray. 
[0040] 

The microarray of the above-mentioned diagnosis kit including a microarray includes
a fiber type microarray. Here, for a method of preparing a fiber type microarray, the
above-mentioned Patent Literatures 6-7 are cited. Namely, one aspect of the present
invention is the above-mentioned diagnosis kit wherein the microarray is a fiber type
microarray.

[0041]

Next, aspects of the present invention will be specifically illustrated by examples, but 
the present invention is not limited to these examples. 
[Example 1] 
[0042] 

Evaluation of gene expression function for prediction of the postoperative prognosis in
estrogen receptor-negative breast cancer
[0043] 

(Tissue sample) 

Informed consent was obtained according to guidelines approved by the ethics
committees of Cancer Society and Nippon Medical School; then primary breast cancer
tissue and tissue from adjacent normal mammary gland were collected from breast cancer
patients who underwent an operation from 1995 to 1997 at Cancer Society attached hospital
(Tokyo). The tissue was quickly frozen and preserved at -80°C. All 954 patients were
followed clinically for 5 years or more or until death, and samples were selected
from 10 estrogen receptor-negative breast cancer patients who died within 5 years after the
operation (5y-D) and 10 patients who survived free of disease for 5 years or more after the
operation (5y-S). The backgrounds of the two patient groups were matched in
age, lymph node metastasis, tumor diameter and tissue type (Table 1).
[0044] 

(Clinical feature of 20 cases of breast cancer) 

[0045] 

[Table 1]

Group  Case no.  ER condition  Age  Sex     Process a    TNM classification b  TTD c
                                                         Tumor   Lymph node
5y-D   3281      Negative      34   Female  a2           T2      N1b           9
5y-D   3459      Negative      64   Female  a2           T4      N3            6
5y-D   3550      Negative      73   Female  a2           T4      N1b           12
5y-D   3892      Negative      62   Female  a2           T2      N1a           21
5y-D   3948      Negative      60   Female  a2           T2      N1a           51
5y-D   4020      Negative      50   Female  a2           T2      N3            28
5y-D   3654      Negative      46   Female  a2           T4      N1b           19
5y-D   4113      Negative      53   Female  a2           T1      N1a           21
5y-D   4462      Negative      34   Female  a1           T2      N1a           24
5y-D   4126                    51           [illegible]  T4      N3            6
5y-S   3656      Negative      31   Female  a2           T2      N1a           >60
5y-S   3197      Negative      42   Female  a1           T1      N1a           >60
5y-S   3662      Negative      58   Female  a2           T2      N0            >60
5y-S   3241      Negative      47   Female  a2           T2      N1a           >60
5y-S   3267      Negative      51   Female  a2           T2      N1a           >60
5y-S   3329      Negative      60   Female  a2           T2      N1a           >60
5y-S   3345      Negative      43   Female  a1           T2      N2            >60
5y-S   3556      Negative      59   Female  a2           T3      N0            >60
5y-S   3558      Negative      57   Female  a2           T3      N1b           >60
5y-S   3658                    42           a1           T2      N1a           [illegible]

a a1: invasive papillotubular carcinoma, a2: invasive solid-tubular carcinoma, b5: squamous cell carcinoma
b TNM classification: clinical classification by Japan Breast Cancer Society
c TTD: time to death after surgery (months)

[0046] 

All patients underwent postoperative adjuvant therapy according to "Postoperative
clinical protocol for breast cancer (nyugan no tameno shujutsugo no rinsho no purotokoru)"
of Cancer Society attached hospital. In each case, the adjuvant therapy was selected
strictly on the basis of surgical operation type, lymph node involvement,
and presence of local or distant metastasis. In the study of the present invention, no
patient had distant metastasis before the adjuvant chemotherapy, and no patient
underwent radiation therapy or chemotherapy before the surgical operation.
[0047]

(Clinicopathological parameter) 

The following parameters were checked: tissue type, tumor diameter and invasion (t
factor), lymph node involvement, and status of estrogen receptor (ER) and progesterone
receptor (PgR). Tumors were classified into the following types according to the TNM
classification and to the tissue classification of Japan Breast Cancer Society (1989);
noninvasive tubular (1a), invasive papillotubular (a1), invasive solid-tubular (a2),
invasive scirrhous carcinoma (a3), and other special types (b). The classification is
basically the same as the breast cancer tissue classification of WHO. t factors were classified
into the following types according to the histological TNM classification; tumor with a
maximum size of 2 cm or less (t1), tumor with no invasion into skin or pectoral muscle and
with a maximum size of 2 cm or more (t2), and tumor with invasion into skin or pectoral
muscle (t3).
[0048] 

(Design and construction of cDNA microarray)

From 25344 cDNAs selected from the UniGene database, a "genome wide cDNA
microarray" was constructed. The cDNAs were made by RT-PCR using poly(A)+ RNAs
isolated from various human organs. The PCR products were spotted on type 7 glass slides
(Amersham Biosciences UK Limited, Buckinghamshire, UK) using Array Spotter
Generation III (Amersham Biosciences). Each slide contains 384 house-keeping genes.
[0049] 

(Preparation and amplification of RNA)

Tumor material was quickly frozen at -80°C immediately after collection. RNA
was extracted using TRIzol (Invitrogen Inc., Carlsbad, CA, USA) and further purified using
RNeasy kits (Qiagen Inc., Valencia, CA). The purity of each RNA was evaluated by
spectrophotometry and by electrophoresis on a 1.2% modified formamide gel. High-purity
RNA was defined as a sample having an absorbance ratio (260 nm/280 nm) of 1.8 to 2.0 and
in which the 28S/18S ribosomal bands show a ratio of 1.8 or more on formamide gel
electrophoresis. After treatment with 1 unit of DNase I (Epicentre Technologies, Madison,
WI) (1 unit/μl), RNA amplification by T7 RNA polymerase was carried out using 2 μg of
RNA from each sample as the starting material. Amplification was carried out twice,
and the amplified RNA (aRNA) was purified by RNeasy kits (Qiagen Inc., Valencia, CA).
The amount of each aRNA was measured by a spectrophotometer, and the quality was
checked by formamide gel electrophoresis.
[0050] 

(Labeling of aRNA, hybridization and scanning) 

cDNA for microarray analysis was prepared from aRNA. aRNAs (5 to 10 μg) from
breast cancer and normal mammary gland tissue were labeled with Cy5 (cancer sample) and
Cy3 (normal sample) using aminoallyl-cDNA labeling kits (Ambion, Austin, TX). The
Cy3-labeled cDNA probe and the Cy5-labeled cDNA probe were mixed and heated at 95°C
for 5 minutes, then chilled on ice for 30 seconds, and hybridized on a microarray.
The mixed probes were added to microarray hybridization solution version 2 (Amersham
Biosciences UK Limited, Buckinghamshire, UK) and formamide (Sigma-Aldrich Corp.,
St. Louis, MO, USA) at a final concentration of 50%. After hybridization at 40°C for 15 hours,
the microarray slides were washed first with 1xSSC and 0.2% SDS at 55°C for 10 minutes,
then washed twice with 0.1xSSC/0.2% SDS, each for 1 minute at room temperature. All
treatments were carried out by Automated Slide Processor System (Amersham). The
signal strength of each hybridization was scanned by GenePix 4000A (Axon Instruments,
Inc., Foster City, CA, USA), and evaluated photometrically by GenePix 3.0 (Axon
Instruments). The scanned signals were normalized by a method described in the
following literature (the total gene normalization method) (Yang YH, Dudoit S, Luu P, et al.
(2002) Nucleic Acids Res 30, e15; Manos EJ, Jones DA. (2001) Cancer Res 61: 433-348).

[0051]

(Signal analysis and selection of genes showing different expressions) 

The signal strength of each hybridization was evaluated photometrically by GenePix
3.0 (Axon Instruments, Inc., Foster City, CA, USA). To normalize mRNA expression
levels between cancer and control, the Cy5:Cy3 ratio for each gene was adjusted
so that the averaged log (Cy5:Cy3 ratio) of the house-keeping genes was zero. 27
house-keeping genes were adopted from the house-keeping panel on the Web site
http://www.nhgri.nih.gov/DIR/LCG/ARRAY/expn.html. For each microarray slide, the
cut-off value of the signal-to-noise (S/N) ratio was set at 3.0. Genes with signal strengths
of both Cy3 and Cy5 lower than the cut-off value were excluded from the investigation.
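
The normalization and filtering described above can be illustrated with a short Python sketch. Several points are assumptions rather than statements of the specification: the logarithm is taken to base 2, a spot is dropped only when both channels fall below the S/N cut-off, and the function name, tuple layout, and example intensities are hypothetical.

```python
import math


def normalize_slide(spots, housekeeping, sn_cutoff=3.0):
    """Return house-keeping-centred log2(Cy5/Cy3) ratios for one slide.

    `spots` maps a gene identifier to (cy5, cy3, sn_cy5, sn_cy3), where the
    last two values are the signal-to-noise ratios of each channel.
    Spots whose S/N is below the cut-off in both channels are excluded.
    """
    kept = {
        gene: math.log2(cy5 / cy3)
        for gene, (cy5, cy3, sn5, sn3) in spots.items()
        if not (sn5 < sn_cutoff and sn3 < sn_cutoff)
    }
    hk_values = [kept[g] for g in housekeeping if g in kept]
    if not hk_values:
        raise ValueError("no house-keeping gene passed the S/N cut-off")
    # Shift all ratios so that the mean house-keeping log ratio becomes zero.
    offset = sum(hk_values) / len(hk_values)
    return {gene: value - offset for gene, value in kept.items()}


# Made-up intensities for illustration only.
spots = {
    "HK1": (200.0, 100.0, 10.0, 10.0),
    "HK2": (400.0, 100.0, 10.0, 10.0),
    "GENE_A": (800.0, 100.0, 10.0, 10.0),
    "GENE_B": (50.0, 40.0, 1.0, 2.0),  # weak in both channels, dropped
}
normalized = normalize_slide(spots, ["HK1", "HK2"])
```

After this step a value of 0 means "expressed like the house-keeping panel on average", and positive values mean relatively higher signal in the Cy5 (cancer) channel.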
[0052] 

(Mann-Whitney test)

To find genes showing clearly different expression between 5y-D tumors
and 5y-S tumors, the Mann-Whitney test was applied to a series of samples X, where X
represents the Cy5/Cy3 signal strength ratio of each gene in each sample (Ono K, Tanaka T,
Tsunoda T, et al. (2000) Cancer Res 60: 5007-5011). The U value was calculated for genes
giving significant signals in at least 5 samples in both groups. Genes showing U values
lower than 23 or larger than 77 were selected. Since the U value is calculated for the
5y-S group relative to the 5y-D group from the X values of each gene, genes with U
values lower than 23 were judged to show higher expression in the 5y-S group than in the
5y-D group. Conversely, genes with U values higher than 77 were judged to show
higher expression in the 5y-D group than in the 5y-S group. Based on this criterion, 183 genes
were highly expressed in the 5y-D group and 31 genes were highly expressed in the 5y-S group.
Then, only genes in which intermediate expression values show a difference of 2-fold or
more between the two groups (μXD/μXS < 0.5 or > 2.0, where μXD and μXS represent average X
values in the 5y-D and 5y-S groups, respectively) were defined as genes correlated with
prognosis. As a result, 110 genes in total were selected. Of them, 90 genes were
expressed at a higher level in the 5y-D tumor group and 20 genes were expressed at a higher
level in the 5y-S tumor group.
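
The selection step above can be sketched in plain Python. The sketch assumes one particular convention for U, namely the number of (5y-D, 5y-S) sample pairs in which the 5y-D value is larger, with ties counted as one half; this convention is chosen because it reproduces the stated direction (low U means higher expression in 5y-S, and with 10 samples per group U ranges from 0 to 100), but the specification does not spell it out. Function names and example values are hypothetical.

```python
def mann_whitney_u(d_values, s_values):
    """Count pairs where the 5y-D value exceeds the 5y-S value (ties = 0.5)."""
    u = 0.0
    for d in d_values:
        for s in s_values:
            if d > s:
                u += 1.0
            elif d == s:
                u += 0.5
    return u


def classify_gene(d_values, s_values, low=23, high=77, fold=2.0, min_samples=5):
    """Apply the U-value thresholds and the 2-fold mean-ratio criterion.

    Returns 'higher in 5y-D', 'higher in 5y-S', or None when the gene is
    not selected. Inputs are Cy5/Cy3 ratios (X) per sample in each group.
    """
    if min(len(d_values), len(s_values)) < min_samples:
        return None
    u = mann_whitney_u(d_values, s_values)
    ratio = (sum(d_values) / len(d_values)) / (sum(s_values) / len(s_values))
    if u > high and ratio > fold:
        return "higher in 5y-D"
    if u < low and ratio < 1.0 / fold:
        return "higher in 5y-S"
    return None


# Made-up ratios for one gene in 10 + 10 samples.
d_group = [4.0] * 10
s_group = [1.0] * 10
call = classify_gene(d_group, s_group)
```

For Example 2 style data with 12 samples per group, the same function could be called with `low=37` and `high=107`, the thresholds given there.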
[0053] 

(Random-permutation test) 

Further, to evaluate the 110 genes selected by the Mann-Whitney test, a
permutation test was carried out, and the probability, Ps, that a gene correlates with the
group difference was estimated. When each gene is represented by an expression vector v(g)
= (X1, X2, ..., X20) (Xi shows the gene expression level of the i-th sample in the first sample
group), an ideal expression pattern is expressed by c = (c1, c2, ..., c20) (ci = +1 or 0,
depending on whether the i-th sample belongs to the S group or the D group).

The correlation between a gene and the group difference, Pgc, was defined as follows:
Pgc = (μS - μD)/(σS + σD), where μS (μD) and σS (σD) show the mean and standard deviation
of log2 X of the gene "g" for the samples in a newly defined S (or D) group.

The permutation test was carried out by permuting the coordinates of c. The
correlation value Pgc was calculated for every permutation. These procedures were
repeated 10000 times. The p value, showing the probability that a gene classifies the two
groups by chance, was evaluated for each of the 110 genes selected. Finally, 71 genes
highly expressed in 5y-D cases and 15 genes expressed at a low level in 5y-D cases were selected.
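
A random-permutation test of this kind might look like the following Python sketch. It assumes the correlation measure is (mean of S minus mean of D) divided by (standard deviation of S plus standard deviation of D), computed on log2 X, and it uses a two-sided comparison of absolute values and the sample standard deviation; those choices, the function names, the fixed random seed, and the example data are assumptions made for illustration.

```python
import math
import random
import statistics


def pgc(values, labels):
    """Correlation of one gene with the group labels (1 = S group, 0 = D group)."""
    s = [math.log2(x) for x, c in zip(values, labels) if c == 1]
    d = [math.log2(x) for x, c in zip(values, labels) if c == 0]
    diff = statistics.mean(s) - statistics.mean(d)
    spread = statistics.stdev(s) + statistics.stdev(d)
    if spread == 0.0:
        # Degenerate case: no within-group variation at all.
        return 0.0 if diff == 0.0 else math.copysign(math.inf, diff)
    return diff / spread


def permutation_p_value(values, labels, n_perm=10000, seed=0):
    """Fraction of label permutations whose |Pgc| reaches the observed |Pgc|."""
    rng = random.Random(seed)
    observed = abs(pgc(values, labels))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute the coordinates of c
        if abs(pgc(values, shuffled)) >= observed:
            hits += 1
    return hits / n_perm


# Made-up Cy5/Cy3 ratios: 10 S-group samples followed by 10 D-group samples.
x = [8.0, 9.0, 7.5, 8.5, 10.0, 9.5, 7.0, 8.2, 9.1, 8.8,
     1.0, 1.2, 0.9, 1.1, 0.8, 1.3, 1.05, 0.95, 1.15, 0.85]
c = [1] * 10 + [0] * 10
p_value = permutation_p_value(x, c, n_perm=2000)
```

A small p value means that a random relabelling of the samples rarely separates the groups as well as the true labels do.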
[0054] 

(Semi-quantitative RT-PCR) 

RNA (2 μg) was treated with DNase I (Epicentre Technologies, Madison, WI, USA),
and single-stranded cDNAs were synthesized by reverse transcription using ReverScript
II reverse transcriptase (manufactured by Wako Pure Chemical Industries, Ltd., Osaka,
Japan) and oligo (dT) 12-18 primer. The concentration of the single-stranded cDNAs was
adjusted for the subsequent PCR amplification by monitoring expression of GAPD
(glyceraldehyde-3-phosphate dehydrogenase) as a quantitative control. Each PCR was
carried out in 30 μl of 1x PCR buffer under the following reaction conditions using a
GeneAmp PCR System 9700 (Applied Biosystems, Foster City, CA, USA).

94°C 5 minutes,

(94°C 30 seconds, 60°C 30 seconds, and 72°C 30 seconds) for 25 to 35 cycles.

Primer sequences used in RT-PCR are as described below:
SEQ ID No. 1 GAPD (control) forward, 5'-GGA AGG TGA AGG TCG GAG T-3';
SEQ ID No. 2 reverse, 5'-TGG GTG GAA TCA TAT TGG AA-3';
SEQ ID No. 3 Hs.108504F, 5'-ACA CTT CAT CTG CTC CCT CAT AG-3';
SEQ ID No. 4 Hs.108504R, 5'-CTG CCT AGA CCT GAG GAC TGT AG-3';
SEQ ID No. 5 Hs.146550F, 5'-ACT GAG GCC TTT TGG TAG TCG-3';
SEQ ID No. 6 Hs.146550R, 5'-TCT CTT TAT TGT GAT GCT CAG TGG-3';
SEQ ID No. 7 Hs.76607F, 5'-AAA TCC TTC TCG TGT GTT GAC TG-3';
SEQ ID No. 8 Hs.76607R, 5'-CAG TCA TGA GGG CTA AAA ACT GA-3';
SEQ ID No. 9 Hs.1975F, 5'-GAA GAC AAC AAG TTT TAC CGG G-3';
SEQ ID No. 10 Hs.1975R, 5'-ATG GTT TTA TTG ACG GCA GAA G-3';
SEQ ID No. 11 Hs.203952F, 5'-AGG ACA CGT CCT CTC CTC TCT C-3';
SEQ ID No. 12 Hs.203952R, 5'-TAA AGC TAG CGA AGG AAC GTA CA-3';
SEQ ID No. 13 Hs.278607F, 5'-TCC CTT CTG TTT CCT CAG TGT T-3';
SEQ ID No. 14 Hs.278607R, 5'-CCT GCC CCG ATA AAA ATA TCT AC-3';
SEQ ID No. 15 Hs.429F, 5'-TTG ACC TTA AGC CTC TTT TCC TC-3';
SEQ ID No. 16 Hs.429R, 5'-ATA ACG TAC ATT CCC ATG ACA CC-3';
SEQ ID No. 17 Hs.75305F, 5'-ACT TTC AAG ATG GGA CCA AGG-3';
SEQ ID No. 18 Hs.75305R, 5'-ATA TAC ACA GAA GCA TGA CGC AG-3';
SEQ ID No. 19 Hs.81170F, 5'-TTG CTG GAC TCT GAA ATA TCC C-3';
SEQ ID No. 20 Hs.81170R, 5'-TTC CCC TGT ACA GTA TTT CAC TCA-3';
SEQ ID No. 21 Hs.99987F, 5'-CTG AGC AAT CTG CTC TAT CCT CT-3';
SEQ ID No. 22 Hs.99987R, 5'-GTT CCA GAT TCG TGA GAA TGA CT-3';
SEQ ID No. 23 Y12781F, 5'-ACC AGT AAC AAC TGT GGG ATG G-3';
SEQ ID No. 24 Y12781R, 5'-CAA ATG AGC TAC AAC ACA CAA GG-3';
SEQ ID No. 25 Hs.104417F, 5'-CCC CCT CCA CCT TGT ACA TAA T-3';
SEQ ID No. 26 Hs.104417R, 5'-GTT TTC GTT TGG CTG GTT GTG-3';
SEQ ID No. 27 cl.21783F, 5'-GTC TGA GAT TTT ACT GCA CCG-3';
SEQ ID No. 28 cl.21783R, 5'-GGA TGG AGC TGG AGG ATA TTA-3';
SEQ ID No. 29 Hs.112628F, 5'-ATT GCT AAG GAT AAG TGC TGC TC-3';
SEQ ID No. 30 Hs.112628R, 5'-TGT CAG TAT AGA AGC CTG TGG GT-3';
SEQ ID No. 31 Hs.170345F, 5'-TTC TTA GGC CAT CCC TTT TCT AC-3';
SEQ ID No. 32 Hs.170345R, 5'-GCA TCT GAA TGT CTT TCT CCC TA-3';
SEQ ID No. 33 Hs.53996F, 5'-CCA TAG GAT CTT GAC TCC AAC AG-3';
SEQ ID No. 34 Hs.53996R, 5'-ACT GGG AGT GGA GGA AAT TAG AG-3';
SEQ ID No. 35 Hs.55422F, 5'-CTA ATG TAA GCT CCA TTG GGA TG-3';
SEQ ID No. 36 Hs.55422R, 5'-CAA ACT GCA AAC TAG CTC CCT AA-3';
SEQ ID No. 37 Hs.112718F, 5'-AAG ACT AAG AGG GAA AAT GTG GG-3';
SEQ ID No. 38 Hs.112718R, 5'-AGG TAA CCC AAA GTG ACA AAC CT-3';
SEQ ID No. 39 Hs.115880F, 5'-TTA AGT GAG TCT CCT TGG CTG AG-3';
SEQ ID No. 40 Hs.115880R, 5'-AGG GCC CCT ATA TCC AAT ACC TA-3';
SEQ ID No. 41 Hs.126495F, 5'-GAT CTT TCA AGA TGA GCC AAG GT-3';
SEQ ID No. 42 Hs.126495R, 5'-AGT CAT TCA GAA GCC ATT GAG AC-3'
[0055] 

(Measurement of signal strength of RT-PCR product and calculation of prognosis score) 
A PCR product was detected by 2% agarose gel electrophoresis and ethidium bromide
staining. The gel was scanned by a digital image processing system (AlphaImager 3300;
Alpha Innotech, San Leandro, CA, USA) according to the Spot Density method. A
two-dimensional region was drawn around each band, and the pixel strength (gene expression)
was obtained as the density, defined as IDV (Integrated Density Value).
The significance of the difference in IDV between the groups was evaluated by Student's
t-test. As a result, 20 genes showing p values of 0.05 or lower in the t-test were selected
as candidates (Table 2). That is, expression levels of the 20 genes were significantly higher
in the 5y-D group than in the 5y-S group. Based on this information, the present inventors
tried to establish a scoring system for predicting the postoperative prognosis. In this
procedure, each gene was scored depending on whether the expression level of each sample was
higher than the average expression level of the 20 samples or not. When the expression level
of a sample was 2-fold or more of the average, +1 point was imparted.
Next, points of all of the 20 genes were summed up to obtain the total vote (prognosis
score) for each sample. As a result, a sample with 8 points or more was evaluated
as an indication of bad prognosis. On the other hand, a sample with fewer than 8 points
was evaluated as an indication of favorable prognosis.
[0056] 
(20 candidate genes of prognosis scoring system)

[0057]

[Table 2]

Hs./Accession No.  Kind
Hs.108504          FLJ20113: ubiquitin-specific protease otubain 1
Hs.146550          MYH9: myosin, heavy polypeptide 9, non-muscle
Hs.194691          RAI3: retinoic acid induced 3
Hs.1975            TDRD3: tudor domain containing 3
Hs.203952          TRRAP: transformation/transcription domain-associated protein
Hs.278607          GSA7: ubiquitin activating enzyme E1-like protein
Hs.429             ATP5G3: ATP synthase, H+ transporting, mitochondrial F0 complex, subunit c (subunit 9) isoform 3
Hs.75305           AIP: aryl hydrocarbon receptor interacting protein
Hs.81170           PIM1: pim-1 oncogene
Hs.99987           ERCC2: excision repair cross-complementing rodent repair deficiency, complementation group 2
Y12781             Transducin (beta)-like 1 protein
Hs.104417          KIAA1205 protein
cl.21783           Hypothetical protein
Hs.112628          Hypothetical protein: MGC43581
Hs.170345          Hypothetical protein FLJ13710
Hs.53996           weakly similar to zinc finger protein 135
Hs.55422           Hypothetical protein
Hs.112718          EST
Hs.115880          EST
Hs.126495          EST

[0058]
(Result) 

257 genes expressed at a significantly high level in estrogen receptor-negative breast
cancer tissue were identified, and 378 genes expressed at a low level were identified likewise.
To identify genes showing different expressions between the 5y-D group and the 5y-S group,
the microarray data were analyzed by the Mann-Whitney test and the Random-permutation test.
As a result, 71 genes in total (including 10 ESTs and 9 genes encoding hypothetical proteins)
in 5y-D tumors were classified in common into a group of higher expression. In contrast, 15
genes (including 3 ESTs) were classified in common into a group of lower expression (Fig.
1).

[0059]

Genes highly expressed in the 5y-D group include the following genes correlated with
proliferation and metastasis of cancer cells; matrix metalloproteinase 2 (MMP2), heat shock
protein 27 (HSPB1), Pim-1 oncogene (PIM1) and transformation/transcription
domain-associated protein (TRRAP).
Genes expressed at a low level in the 5y-D group include genes of HLA-C (major
histocompatibility complex, class I, C) and specific kinase. Many genes
correlated with DNA repair, transcription, signal transduction, cytoskeleton and
adhesiveness showed different expressions between the two groups.

[0060] 

To confirm the reliability of the microarray data, 20 genes highly expressed in the
5y-D group were selected (Hs.108504, Hs.146550, Hs.194691, Hs.1975, Hs.203952,
Hs.278607, Hs.429, Hs.75305, Hs.81170, Hs.99987, Y12781, Hs.104417, cl.21783,
Hs.112628, Hs.170345, Hs.53996, Hs.55422, Hs.112718, Hs.115880, and Hs.126495), and
the expression levels of the genes were checked by semi-quantitative RT-PCR. The result
coincided with the microarray data, and was statistically significant for classifying the
5y-D group and the 5y-S group (typical data is shown in Fig. 2).
[0061] 

To construct a scoring system for predicting the postoperative prognosis using the
expression profile of marker genes, the prognosis score was calculated by the above-mentioned
method. Briefly, a marker gene was selected according to the following criteria.
(1) Higher signal strength than the cut-off level is shown in at least 60% of cases checked;
(2) |μD - μS| is 10 or less. Here, μD (μS) shows the average value derived from the
logarithm-converted relative expression ratio in the 5y-D (5y-S) cases.
[0062] 

Next, to identify marker genes capable of classifying the 5y-D group and the 5y-S
group depending on the expression function, the Mann-Whitney test and the
Random-permutation test were carried out. The correlated microarray results were
confirmed by a semi-quantitative RT-PCR experiment. By Student's t-test, 20 genes
were selected as prognosis markers (Table 2).
[0063] 

Depending on the prognosis score (PS) of the present invention, 20 patients were
divided into 10 members predicted to show poor prognosis (PS is 11 or more) and 10
members predicted to show excellent prognosis (PS is less than 11). As a result,
comparison with the postoperative progress showed that the scoring system of the
present invention is reliable, with an accuracy of 80% in the 5y-D cases and an
accuracy of 100% in the 5y-S cases (Fig. 3A).

[0064] 

Using the prognosis scoring system of the present invention, 5 additional cases were
checked (Fig. 3B). The system predicted poor prognosis in 2 cases (PS > 11; patient TD-1
and patient TD-2), and excellent prognosis in 3 cases (PS < 11; patients TD-3, TS-1 and
TS-2). As a result, this scoring system showed an accuracy of 80% regarding actual
clinical results of these 5 cases.
[Example 2]
[0065]

Evaluation of gene expression function for prediction of the postoperative prognosis in 
node-negative breast cancer 
[0066] 

(Tissue sample) 

A tissue sample was collected in the same manner as described in Example 1. Gene
expression was investigated for tumors from 12 patients with node-negative (n0) cancer
who showed recurrence within 5 years after an operation (5Y-R) and 12 patients who survived
free of disease for 5 years or more after the operation (5Y-F). The clinical backgrounds of
the two patient groups were matched in age, lymph node metastasis, tumor diameter,
hormone receptor status, and pathological tissue (Table 3). The median follow-up
period was 7.8 years, and the average period between the initial operation and
recurrence was 2.7 years in the 5Y-R group. All patients underwent the adjuvant therapy
described in Example 1.
[0067] 


(Clinical pathological data) 

[0068] 

[Table 3]

Case  Age  Climacteric  Histological      Position  Diameter  TNM classification b  Stage  ER(+/-)  PgR(+/-)  D.F.I. c
           condition    classification a            (mm)      T     N     M
R-1   55   Post.        a2                Rt.       25        2     1a    0         II     +        -         12m
R-2   50   Pre.         a3                Lt.       25        2     1a    0         II     +        +         16m
R-3   42   Pre.         a2                Rt.       25        2     0     0         II     +        +         49m
R-4   39   Pre.         a3                Rt.       35        2     0     0         II     +        -         20m
R-5   38   Pre.         a2                Lt.       30        2     0     0         II     +        +         52m
R-6   61   Post.        a3                Lt.       34        2     0     0         II     -                  14m
R-7   54   Post.        b3                Lt.       30        2     0     0         II     -                  24m
R-8   37   Pre.         a2                Rt.       23        2     0     0         II                        25m
R-9   54   Post.        a3                Lt.       25        2     1a    0         II     +        +         47m
R-10  83   Post.        a2                Rt.       28        2     1a    0         II     +        +         38m
R-11  62   Post.        a2                Lt.       23        2     0     0         II              +         40m
R-12  50   Post.        a3                Lt.       35        2     0     0         II                        25m
F-1   48   Pre.         a2                Lt.       18        2     0     0         II     +        +         8Y
F-2   62   Post.        a2                Rt.       25        2     0     0         II     +                  8Y
F-3   57   Post.        a1                Rt.       20        1     0     0         I      +        +         7Y10m
F-4   61   Post.        a2                Lt.       30        2     1a    0         II                        7Y2m
F-5   42   Pre.         a1                Lt.       12        1     1a    0         I               +         7Y11m
F-6   51   Pre.         a2                Rt.       28        2     1a    0         II                        7Y10m
F-7   59   Post.        a2                Rt.       40        3     0     0         II                        7Y5m
F-8   57   Post.        a2                Rt.       45        3     1b    0         II                        7Y5m
F-9   42   Pre.         a1                Lt.       48        2     1a    0         II              +         7Y3m
F-10  58   Post.        a2                Lt.       13        2     0     0         II                        7Y3m
F-11  50   Post.        a2                Lt.       25        2     0     0         II     +        +         7Y8m
F-12  55   Post.        a1                Rt.       35        2     0     0         II     +        +         7Y5m

a a1: invasive papillotubular carcinoma, a2: invasive solid-tubular carcinoma, a3:
invasive scirrhous carcinoma

b TNM classification: clinically classified according to TNM classification by Japan
Breast Cancer Society

c D.F.I.: disease-free interval
[0069] 

(Clinicopathological parameter) 

The clinicopathological parameters were checked by the method described in Example 1.
The histological grade was evaluated by the method of Elston and Ellis (Abrams JS. Breast
Cancer 2001; 8: 298-304). Lymphatic invasion was evaluated as negative or positive
(for example, evaluated as positive when one or more cancer cells are present in
lymphatic vessels around the cancer). Fat invasion was evaluated as negative or positive (for
example, evaluated as positive in the case of invasion into interstitial tissue).
[0070]

(Preparation of cDNA microarray) 

"Genome wide cDNA microarray kit (Amersham Biosciences UK Limited, 
Buckinghamshire, UK)" with 25344 cDNAs was used. The PCR product was spotted on
type 7 glass slides (Amersham Biosciences) using Array Spotter Generation III (Amersham
Biosciences).
[0071] 

(Preparation and amplification of RNA)

Preparation and amplification of RNA were carried out by the same method as
described in Example 1.
[0072] 

(Labeling of aRNA, hybridization and scanning) 

Labeling of aRNA, hybridization and scanning were carried out in the same method as 
described in Example 1. 
[0073]

(Mann-Whitney test) 

To identify genes differentially expressed between the disease-free group and the 
recurrence group, normalized signals were analyzed by the Mann-Whitney test applied to 
a series of Xs. Here, X represents the Cy5/Cy3 signal intensity ratio of each gene in each 
sample. Genes showing a difference of 2-fold or more in expression intensity between the two 
groups were selected. Genes with signal-to-noise ratios of 3.0 or less were excluded from 
analysis. 

The U value was calculated for genes giving significant signals in at least 5 
samples in both groups. Genes with U values lower than 37 or higher than 107 were 
selected. Since the U value of each gene was calculated from its X values for the 5Y-F group 
relative to the 5Y-R group, genes with U values lower than 37 were 
evaluated as more highly expressed in the 5Y-F group than in the 5Y-R group (first category). 
On the other hand, genes with U values higher than 107 were evaluated as more highly 
expressed in the 5Y-R group than in the 5Y-F group (second category). 

Based on this method, 78 genes were identified in the first category and 55 genes were 
identified in the second category. Of these, only genes showing a difference of 2-fold or more 
in the average expression value between the two groups (μXr/μXf < 0.5 or > 2.0, where μXr and 
μXf represent the average X values in the 5Y-R and 5Y-F groups, respectively) were defined as 
genes correlated with prognosis. In total, 98 genes were selected; of them, 64 genes 
showed a higher expression level in 5Y-F tumors and 34 genes showed a higher expression level 
in 5Y-R tumors. 
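
The two-step selection described above (U thresholds of 37 and 107, then a 2-fold difference in the group averages) can be sketched as follows. This is a minimal illustration, not the inventors' implementation. It assumes that U counts the sample pairs in which the 5Y-R value exceeds the 5Y-F value, which is one reading of the text consistent with "U lower than 37 means higher expression in 5Y-F". The function names and the example ratios are hypothetical, and the signal-to-noise and minimum-sample filters are omitted.

```python
from statistics import mean

def mann_whitney_u(x, y):
    """Count the pairs (xi, yj) with xi > yj; a tie counts as one half."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u

def categorize(x_f, x_r, u_low=37, u_high=107):
    """Return the category of one gene, or None if it is not selected.

    x_f, x_r: Cy5/Cy3 ratios X of the gene in the 5Y-F and 5Y-R samples.
    Assumption: U counts pairs in which the 5Y-R value exceeds the 5Y-F
    value, so a small U means higher expression in the 5Y-F group.
    """
    u = mann_whitney_u(x_r, x_f)
    ratio = mean(x_r) / mean(x_f)  # corresponds to muXr / muXf in the text
    if u < u_low and ratio < 0.5:
        return "first category (higher in 5Y-F)"
    if u > u_high and ratio > 2.0:
        return "second category (higher in 5Y-R)"
    return None

# Hypothetical ratios for one gene in 12 + 12 samples (illustrative only).
x_f = [4.0 + 0.1 * i for i in range(12)]
x_r = [1.0 + 0.1 * i for i in range(12)]
print(mann_whitney_u(x_r, x_f), categorize(x_f, x_r))
```

With 12 samples per group the two U values of a gene always sum to 12 x 12 = 144, which is why the thresholds 37 and 107 are symmetric.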
[0074] 

(Random-permutation test) 

To evaluate the genes selected by the Mann-Whitney test, a permutation test 
was carried out, and the correlation of each selected gene with the group distinction (Pgc) was evaluated. 
When each gene is represented by an expression vector v(g) = (X1, X2, ..., X24) (Xi is 
the gene expression level of the i-th sample in the first sample set), an idealized expression 
pattern is expressed by c = (c1, c2, ..., c24) (ci = +1 or 0, depending on whether the i-th 
sample belongs to the F group or the R group). 

The correlation between a gene and the group distinction, Pgc, was defined as follows: 
Pgc = (μF + μR)/(σF + σR), where μF (μR) and σF (σR) denote the mean and standard deviation of log2 X of 
the gene "g" over the samples in a newly defined "F" group or "R" group. 

The permutation test was carried out by permuting the coordinates of c. The 
correlation values Pgc were calculated for all permutations. These procedures were 
repeated 10000 times. The p value, indicating the probability that a gene 
classifies the two groups by chance, was evaluated for each of the 58 genes selected. 
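
A label-permutation test of this kind can be sketched as below. The sketch rests on several assumptions that are not confirmed by the text: the numerator of Pgc is printed with a plus sign, whereas the sketch uses the difference of the group means, the form conventionally used for a signal-to-noise statistic; the sample standard deviation is used; and the p value is taken as the fraction of permutations whose absolute statistic reaches the observed one. All names and data are hypothetical.

```python
import math
import random
from statistics import mean, stdev

def signal_to_noise(f_vals, r_vals):
    """(mean_F - mean_R) / (sd_F + sd_R); the minus sign is an assumption."""
    return (mean(f_vals) - mean(r_vals)) / (stdev(f_vals) + stdev(r_vals))

def permutation_p(x, labels, n_perm=10000, seed=0):
    """Estimate how often shuffled group labels separate the gene as well.

    x: Cy5/Cy3 ratios of one gene; labels: 1 for F samples, 0 for R samples.
    Assumes the values within each group are not all identical, otherwise
    both standard deviations are zero and the statistic is undefined.
    """
    log_x = [math.log2(v) for v in x]

    def statistic(lab):
        f = [v for v, c in zip(log_x, lab) if c == 1]
        r = [v for v, c in zip(log_x, lab) if c == 0]
        return abs(signal_to_noise(f, r))

    observed = statistic(labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute the coordinates of c
        if statistic(shuffled) >= observed:
            hits += 1
    return hits / n_perm

# Hypothetical gene that separates 12 F samples from 12 R samples.
x = [4.0 + 0.1 * i for i in range(12)] + [1.0 + 0.1 * i for i in range(12)]
labels = [1] * 12 + [0] * 12
print(permutation_p(x, labels, n_perm=2000))
```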

[0075] 

(Semi-quantitative RT-PCR) 

RNA (5 μg) was treated with DNase I (Epicentre Technologies, Madison, WI, USA) and 
then reverse-transcribed into single-stranded cDNA using Reverscript II 
reverse transcriptase (manufactured by Wako Pure Chemical Industries, Ltd., Osaka, Japan) 
and 0.5 μg/μl oligo(dT)12-18 primer. The single-stranded cDNA preparations were 
diluted for the subsequent PCR amplification by monitoring GAPDH as a quantitative 
control. All PCRs were carried out in 30 μl of 1x PCR buffer under the following reaction 
conditions using a GeneAmp PCR System 9700 (Applied Biosystems, Foster City, CA, USA): 
94°C for 2 minutes; 
27 to 35 cycles of (94°C for 30 seconds, 58-62°C for 30 seconds, and 72°C for 30 seconds); 
72°C for 5 minutes. 

Primer sequences for RT-PCR of GAPDH are as described below: 
SEQ ID No. 43 (forward) 5'-GAAAGGTGAAGGTCGGAGT-3' 
SEQ ID No. 44 (reverse) 5'-TGGGTGGAATCATATTGGAA-3' 
[0076] 

(Primer of semi-quantitative PCR (gene highly expressed in group of no disease)) 

[0077] 

[Table 4A] 
Ac./HS     SEQ ID No.  Forward                            SEQ ID No.  Reverse
M90439     45          CCAGACATCCATGGTACCTATAA            46          TATGCATTGAAACCTTACAGGGG
AF047472   47          CTGTTAAACAAAGCGAGGTTAAGG           48          GGGTTCTGCATCTCGTTTATTAG
Hs.118251  49          GACACATAGCTCATAGGCACACA            50          TTCTGGTACATGGTAAGTGCTCA
D26125     51          TCCGCCATATTGATTCTGCTTA             52          GTTTGCTTTCTGGACCATGGATA
Hs.8619    53          GATAACAACTGGACCACATCCC             54          AACAGGCAGACGAGGTAGACAC
X16135     55          GAGAAGGATGGGTCCACCAGT              56          GTACATGGGCAGCACAAATGTAT
Hs.9006    57          ATTTCATTGGTAGTATGGCCCAC            58          ATACCATGGGACAGGATTGTAAG
M18963     59          GCTCAGACCAGCTCATACTTCAT            60          CCAAAGACTGGGGTAGGTAAAAC
X07979     61          CTGGTGCTTTCTATCACCTCTTC            62          GACTAGTGTGAAACAAGATGGGC
AF018080   63          CTTGAACCCAGGAGTTTGAGAC             64          GTGCCTCAGCTTTCTGAGTAGC
Hs.58464   65          CTGGTGCTGACTATCCAGTTGA             66          CTGGTAAACTGTCCAAAACAAGG
S79867     67          CTCTTACCTGGACAAGGTGCGT             68          GGATGAGCTCTGCTCCTTGAG
J02854     69          CAATGTTTGACCAGTCCCAGA              70          CATGTTGTCTCAGTCCTCTATTGG
Z35309     71          GGACAGCAGCTGGAGTACACA              72          AATCAGATTTGTCGGTGCCTT
Hs.83097   73          GGCTCTGCACTAAGAACACAGAG            74          ACAACTAGCTCTCAGTTCAGGCA
Hs.79137   75          TrtfiAfiCAfYTATYiACA AfiCTAC ft A  76          AAGCAGCACTGCATAAACTGTTC
Hs.4864    77          TAAGTACTTTCCTGTGGGTCGCT            78          CCACAAACAGGAAGCTATGTTCT
Y00052     79          GTACTATTAGCCATGGTCAACCC            80          CTACAGAAGGAATGATCTGGTGG
Hs.5002    81          ATCAGTACGGGGACCTTACAAAC            82          CCTGTACTGAGCTCTCCAAAGAC
U43519     83          TCCCTAGCTTCCTCTCCACA               84          AGAATCATGCCTCCCCTTCT
Hs.94653   85          ACCCCTCAAGTGTAAGGAACTG             86          GGATCAAGAGTGTGTGTGTGTGT
X51441     87          CAATGCCAGAGAGAATATCCAGA            88          GATACCCATTGTGTACCCTCTCC
Hs.108623  89          CCACTCCACATAAGGGGTTTAG             90          GAGGTTCTAGCTAAGTGCAGGGT
Hs.5318    91          CCATTGACATTGGAGTTAAGTATGC          92          GGCAAAGACCACATTTAGCAAT
Hs.69469   93          GAAAGCCTATGTGAAAAGCTGGT            94          TTGTTTCCAGGCATTAAGTGTG
AA777648   95          GCATCTTAGTCCACACAGTTGGT            96          GCCCTTACAGGTGGAGTATCTTC
Hs.106131  97          CTCATAGCCAGCATGACTTCTUT            98          GGTTCACTTGTGACTGGTCATCT
X54079     99          ACTTTTCTGAGCAGACGTCCAG             100         TATCAAAAGAACACACAGGTGGC
AI041182   101         ACGTTATTCCCAGTTCCTAAACC            102         AGTCTCGGGTGACTCAATATGAA
AA148265   103         AGTTGAACCCAGGTACCTTTCTC            104         CTAGGCCCTTTTAGAAAACATGG
Hs.4943    105         TACTGGGAACGACTAAGGACTCA            106         TGCTGTGTTGAGTAGGTTTCTGA
Hs.106326  107         TGAGAGTCCTCAGAGGGTATCAG            108         CTTGAAGTCAAGAGTCCTGGTGT
M13436     109         TTTCTGTTGGCAAGTTGCTG               110         CCCTTTAAGCCCACTTCCTC
X99920     111         GATGAGAAGATGAAGAGCTTGC5A           112         GAGGAAGCTTTATTTGGGAAGAG
U22970     113         ACTTCCCTCTCTGCCTTTCTG              114         CAGATTGTTTTGGGCTTCTCACT

[0078] 

(Primer of semi-quantitative PCR (gene highly expressed in group of recurrence)) 
[0079] 
[Table 4B] 
Ac./HS     SEQ ID No.  Forward                  SEQ ID No.  Reverse
X75252     115         GTCTGGTCAGCTTTGCTTCC     116         GGCAAGTTCTGCACAGATGA
AA989127   117         CAGCTCAGTGCACCATGAAT     118         GTGGGACTGAGATGCAGGAT
Hs.128520  119         CACGGACTCATGAATGTAGTGAA  120         GTGTAGTGGCACGATCATAGCTT
HSMLN50    121         GGGACCAAACAGACCAAAGA     122         CACCCCACAGAGCCTGTATT
AF058701   123         CGGAAAGGCACTATTTCACAAT   124         ACAGGCCCACAGGTTTGTAAC
AF043473   125         AAGCTCTTCAGCTGCGTCTC     126         CCTCCTCCTTTTCAGCTGTG
Hs.26052   127         TCTGGAACCCTAAAAGTGTCGT   128         TCTTTCAACATCTCTCCACCCTA
Hs.77961   129         AGATACCTGGAGAACGGGAAG    130         GGAAGTAAGAAGTTGCAGCTCAG
Hs.26484   131         ATTAGGTTTCACCCAAAG       132         AGACGAGACTTG I I 1 1 CTC
U44798     133         CAGGGACTTGGTCACAGGTT     134         TTCTTCTCCCTCCCCTTGAT
Hs.77961   135         GATTACATCGCCCTGAACGAG    136         TCCATCAACCTCTCATAGCAAA
X64707     137         GTAAGATCCGCAGACGTAAGG    138         CTGAAGTCAGCCTCTGTAACCTC
Hs.6780    139         ACTGACCCCACTTCTTGTGG     140         ACCCTTCCCTGTTGCTGTC
Hs.153428  141         TCAAAGTATTTAGCTGACTCGCC  142         TAGTCACTCCAGGTTTATGGAGG
AI066764   143         GGGAACTTGAATTCGTATCCATC  144         CTGAATCTCAAACCTGGAGAGTG
cl.5994    145         GATCATCTTTCCTGTTCCAGAG   146         CTGGAAGGTTCTCAGGTCTTTA
D67025     147         GTACGACCAGGCTGAGAAGC     148         ATCTTCGGGGCTATCCAACT
X16064     149         TCAGCCACGATGAGATGTTG     150         TGTGGATGACAAGCAGAAGC
M80469     151         ACCTTAGGAGGGCAGTTGGT     152         AGGGGTCACACCTTGAACAG
E02628     153         GCATCCTACCACCAACTCGT     154         GCAGCATCACCAGACTTCAA
HUMTHYB4   155         ACAAACCCGATATGGCTGAG     156         GCCAATGCTTGTGGAATGTA
Hs.116922  157         TCGGACCATAATCCAAGTTACC
x15940     158         TAACCCGAGAATACACCATCAAC  159         ATGGTTTTATTGACGGCAGAAG

[0080] 

(Measurement of signal strength of RT-PCR product and calculation of prognosis score) 

The signal intensity of each RT-PCR product was measured and evaluated by the same 
method as described in Example 1, and 10 genes with p values of 0.05 or lower in the t-test 
were selected as candidates; of them, the expression levels of 3 genes were higher in the 5Y-R 
group than in the 5Y-F group, and the expression levels of 7 genes were higher in the 5Y-F 
group than in the 5Y-R group. Based on this information, the present inventors tried to 
establish a scoring system for predicting the postoperative prognosis of node-negative breast 
cancer. 
[0081] 

To obtain the expression level of each gene, the expression ratio (ER) 
relative to GAPDH expression was calculated according to the following formula: 

ER of gene A = (16-bit imaging score of the semi-quantitative PCR product of gene A (intensity of the band 
stained with ethidium bromide) in cancer sample X) / (16-bit imaging score of 
GAPDH in cancer sample X) 
[0082] 

(Definition of scoring system for predicting postoperative prognosis of node-negative breast 
cancer) 

To obtain a gene-based index of the postoperative prognosis of node-negative breast cancer, 
the prognosis score (PS) was defined as follows: PS = (sum of normalized expression ratios of genes highly 
expressed in the 5Y-R group as compared with the 5Y-F group) - (sum of normalized expression ratios 
of genes highly expressed in the 5Y-F group as compared with the 5Y-R group) 
[0083] 

The significance of the difference in expression ratio between the two groups was evaluated by 
Student's t-test. All statistical analyses were carried out with StatView version 5.0 (SAS 
Institute, Cary, NC). 
[0084] 
(Result) 

Clinicopathological findings of the 24 breast cancer patients whose genome-wide gene 
expression was investigated are summarized in Table 3. The present inventors 
investigated gene expression using a cDNA microarray composed of 25344 human genes 
in tumors from 12 node-negative breast cancer patients who survived free of 
disease for 5 years or more after an operation (5Y-F) and 12 node-negative breast cancer 
patients who showed recurrence of breast cancer within 5 years after a surgical 
operation (5Y-R). The clinical backgrounds of the two groups were matched for age, tumor 
diameter, estrogen receptor and progesterone receptor status, and pathology. 

[0085] 

The cDNA microarray data were analyzed by the Mann-Whitney test and the 
random-permutation test, and genes differentially expressed between 5Y-R tumors 
and 5Y-F tumors were identified. Through this filter, 58 genes in total were selected; of 
them, 21 genes showed significantly higher expression in 5Y-R tumors and 37 genes showed 
higher expression in 5Y-F tumors. 
[0086] 

The 37 genes showing higher expression in 5Y-F tumors than in 5Y-R tumors included 
six ESTs and one hypothetical protein (Table 5A; the difference in expression between the groups is 
expressed as "fold change"). 
[0087] 

(Genes with significantly higher expression in 5Y-F tumors than in 5Y-R tumors) 
[0088] 



[Table 5A] 
Ac./HS     fold change  p value  kind
M90439     2.324        0.0014   molecular marker (EPC-1) gene
AF047472   2.889        0.0021   spleen mitotic checkpoint BUB3 (BUB3)
Hs.118251  2.121        0.0031   ESTs
D26125     2.084        0.0038   3 alpha-hydroxysteroid/dihydrodiol dehydrogenase DD4, partial cds
Hs.8619    3.375        0.0041   SRY (sex determining region Y)-box 18
X16135     4.839        0.0042   novel heterogeneous nuclear RNP protein, L protein
Hs.9006    3.807        0.0058   VAMP (vesicle-associated membrane protein)-associated protein A, 33kDa
M18963     2.022        0.0060   islet of Langerhans regenerating protein (reg)
X07979     2.997        0.0068   integrin beta 1 subunit
AF018080   4.016        0.0071   PYRIN (MEFV)
Hs.58464   5.415        0.0079   ESTs
S79867     2.254        0.0090   type I keratin 16 [human, epidermal keratinocytes, mRNA Partial, 1422 nt]
J02854     2.668        0.0090   myosin light chain (MLC-2)
Z35309     2.264        0.0094   adenylate cyclase 8 (brain)
Hs.83097   4.979        0.0096   hypothetical protein FLJ22955
Hs.79137   2.401        0.0105   protein-L-isoaspartate (D-aspartate) O-methyltransferase
Hs.4864    2.043        0.0107   ESTs
Y00052     2.966        0.0107   peptidylprolyl isomerase A (cyclophilin A)
Hs.5002    2.032        0.0114   copper chaperone for superoxide dismutase; CCS
U43519     2.022        0.0114   dystrophin-related protein 2 (DRP2)
Hs.106326  4.733        0.0123   ESTs
Hs.94653   2.08         0.0129   neurochondrin (KIAA0607)
M13436     2.946        0.0135   ovarian beta-A-inhibin
X51441     2.383        0.0155   serum amyloid A (SAA) protein partial, clone pAS3-alpha
Hs.108623  2.019        0.0174   thrombospondin 2
Hs.5318    4.38         0.0174   ESTs
Hs.69469   2.279        0.0197   GA17 protein
AA777648   2.386        0.0209   peripheral myelin protein 22
Hs.106131  2.022        0.0213   ESTs
X54079     5.637        0.0217   heat shock protein HSP27
D67025     3.179        0.0359   proteasome (prosome, macropain) 26S subunit, non-ATPase, 3
M80469     3.572        0.0380   MHC class I HLA-J gene
AI041182   2.321        0.0380   ov776i07.x1 Soares_testis_NHT Homo sapiens cDNA clone IMAGE:1643364
AA148265   2.019        0.0440   RIBOSOMAL PROTEIN L21
Hs.4943    2.426        0.0442   Inter-Alpha-Trypsin Inhibitor Heavy Chain LIKE gene
X99920     3.326        0.0456   S100 calcium-binding protein A13
U22970     2.741        0.0465   interferon-inducible peptide (6-16) gene

[0089] 

In Table 5B, the 21 genes highly expressed in the 5Y-R group are listed. Of them, five 
genes are ESTs and one gene encodes a hypothetical protein. From the panel of 58 genes, 
markers for postoperative prognosis were selected according to the following criteria: (1) 
signal intensity higher than the cut-off level in at least 60% of cases; (2) |μR - μF| 
> 1.0. Here, μR (μF) denotes the average of the log-transformed expression 
ratios in the 5Y-R (5Y-F) cases. 
[0090] 
(Genes with significantly higher expression in 5Y-R tumors than in 5Y-F tumors) 
[0091] 



[Table 5B] 
Ac./HS     fold change  p value  kind
X75252     4.506        n nm 1   Prostatic binding protein
AA989191   J. / ul      u.uuuu   ihcijji i iiolu i_fU i i i|jcti_i unity i_fU 1 1 1 (Jic a, uiaoo J-^^-'
Hs.128520  1.419        0.0067   ESTs
HSMLN50    3.482        n nn7i   ESTs
AF058701   2.185                 DNA polymerase zeta catalytic subunit (REV3)
AF043473   4.786        n m 44   delayed-rectifier K+ channel alpha subunit (KCNS1); potassium voltage-gated channel, delayed-rectifier, subfamily S, member 1
Hs.26052   4            u.ui ou  hypothetical protein MGC43306
Hs.77961   5.775        0.0152   major histocompatibility complex, class I, B
Hs.26484   5.07         0.0157   HIRA interacting protein 3
U44798     2.615        0.0194   U1-snRNP binding protein homolog (70kD)
Hs.77961   5.775        0.0209   MHC class I HLA-Bw62
X64707     2.758        0.0210   BBC1 mRNA (ribosomal protein L13)
Hs.6780    2.749        0.0220   PTK9L protein tyrosine kinase 9-like (A6-related protein)
Hs.153428  3.164        0.0234   ESTs
AI066764   2.606        0.0275   lectin, galactoside-binding, soluble, 1 (galectin 1)
cl.5994    2.844        0.0286   ESTs
x16064     3.567        0.0366   Tumor protein, translationally-controlled 1
E02628     4.055        0.0427   polypeptide chain elongation factor 1 alpha
HUMTHYB4   4.05         0.0436   thymosin beta-4
Hs.116922  2.538        0.0494   ESTs
x15940     2.125        0.0499   ribosomal protein L31

[0092] 

Seven genes highly expressed in 5Y-F tumors as compared with 5Y-R tumors (Hs.94653, 
M13436, Hs.5002, D67025, M80469, Hs.4864 and Hs.106326; p = 0.0018, 0.0011, 0.001, 
0.008, 0.0081, 0.0018 and 0.001, respectively, by Student's t-test) and 3 genes relatively 
highly expressed in 5Y-R tumors (AF058701, AI066764 and x15940; p = 0.0351, 0.00161 
and 0.0001, respectively, by Student's t-test) met these criteria and were selected 
as prognosis markers (Table 6). 
[0093] 

(Genes selected as prognosis marker for node-negative breast cancer) 

[0094] 

[Table 6] 
AF058701   DNA polymerase zeta catalytic subunit (REV3)
AI066764   lectin, galactoside-binding, soluble, 1 (galectin 1)
x15940     ribosomal protein L31
Hs.94653   neurochondrin (KIAA0607)
M13436     ovarian beta-A-inhibin
Hs.5002    copper chaperone for superoxide dismutase; CCS
D67025     proteasome (prosome, macropain) 26S subunit, non-ATPase, 3
M80469     MHC class I HLA-J gene
Hs.4864    ESTs
Hs.106326  ESTs


[0095] 

Expression of these markers was confirmed by semi-quantitative 
RT-PCR normalized to GAPDH expression. Fig. 3 shows the RT-PCR results for the three 
marker genes highly expressed in samples from the 12 patients showing recurrence of breast 
cancer (5Y-R group). Fig. 4 shows the results for the 7 marker genes highly expressed in the 5Y-F 
group (5-year survival). The expression ratios of these 10 genes were used to define 
the prognosis index. 
[0096] 

The prognosis score (PS) was defined as follows: 

PS = (sum of normalized expression ratios of the 3 genes highly expressed in 5Y-R tumors) 
- (sum of normalized expression ratios of the 7 genes highly expressed in 5Y-F tumors) 
[0097] 

The prognosis scores of the 24 cases investigated are summarized in Table 7 together with 
the expression ratio of each marker gene. The PS system predicted poor prognosis for cases 
R1 to R12, which had prognosis scores higher than 3. On the other hand, excellent prognosis 
was predicted for cases F1 to F12, which had scores lower than -16. The predictions 
coincided with the actual clinical outcomes with an accuracy of 100% (Fig. 5). The 
average PS of the 5Y-R group was 9.44 and the average PS of the 5Y-F group was -28.92. 
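
The arithmetic behind the expression ratio and the prognosis score can be sketched as below. The function names are hypothetical. The input values are read from the first row of Table 7 with the decimal points restored, which is an assumption about the printed values, though one that reproduces the PS printed for that row.

```python
def expression_ratio(gene_score, gapdh_score):
    """ER = band intensity of the gene / band intensity of GAPDH (same sample)."""
    if gapdh_score <= 0:
        raise ValueError("GAPDH score must be positive")
    return gene_score / gapdh_score

def prognosis_score(er_recurrence_genes, er_disease_free_genes):
    """PS = sum of ERs of the 5Y-R marker genes - sum of ERs of the 5Y-F marker genes."""
    return sum(er_recurrence_genes) - sum(er_disease_free_genes)

# Expression ratios of case "1n" (first row of Table 7, decimal points assumed).
er_r = [8.90, 2.70, 8.35]                          # x15940, AF058701, AI066764
er_f = [1.50, 0.82, 1.47, 2.43, 2.72, 2.60, 2.55]  # the 7 genes high in 5Y-F
ps = prognosis_score(er_r, er_f)
print(round(ps, 2))  # 5.86, the PS printed for this case
```

Because the three recurrence-associated ratios enter with a positive sign and the seven disease-free-associated ratios with a negative sign, a larger PS corresponds to a poorer predicted prognosis.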
[0098] 

(Prognosis score for recurrence of node-negative breast cancer) 
[0099] 



[Table 7] 
No.   x15940    AF058701  AI066764  Hs.5002   Hs.94653  M13436    M80469    D67025    Hs.4864   Hs.106326  PS
1n    8.90      2.70      8.35      1.50      0.82      1.47      2.43      2.72      2.60      2.55       5.86
2n    7.02      2.19      7.48      1.14      0.50      1.51      2.32      1.27      1.89      0.62       7.44
3n    7.57      2.36      10.96     1.40      0.55      2.29      3.51      2.38      1.79      0.44       8.53
4n    8.57      2.79      9.78      1.75      1.42      2.02      3.30      3.03      3.44      3.02       3.16
5n    14.96     2.56      18.01     3.88      0.53      0.67      3.96      2.76      3.78      1.83       18.12
6n    16.94     3.97      12.76     0.11      0.73      1.50      3.19      2.01      3.60      4.41       18.12
7n    14.51     3.02      11.62     0.37      2.24      2.05      2.14      1.45      1.64      2.96       16.30
8n    9.50      2.81      10.43     2.86      1.64      1.95      5.40      3.18      1.89      1.79       4.03
9n    8.29      2.96      8.32      0.78      0.55      1.91      1.50      1.31      1.40      2.80       9.32
10n   6.78      2.06      10.59     0.39      1.93      0.70      2.49      3.56      1.27      0.84       8.25
11n   7.30      1.38      10.89     3.03      2.82      0.46      2.18      3.09      2.00      2.16       3.83
12n   8.60      3.81      15.86     3.31      3.46      0.70      3.19      1.82      2.54      2.95       10.30
1nR   4.67      0.81      4.69      4.13      2.98      3.80      7.78      5.34      7.59      8.47       -29.92
2nR   4.32      0.63      3.88      2.82      2.68      2.89      4.51      3.74      4.86      9.28       -21.95
3nR   10.54     0.56      7.28      2.40      2.06      2.10      8.13      6.02      6.02      8.55       -16.95
4nR   5.59      0.56      4.85      3.22      3.69      2.89      11.18     3.31      6.39      11.36      -31.04
5nR   5.56      0.18      4.97      5.57      4.57      1.15      3.18      4.85      5.56      12.68      -26.85
6nR   4.50      0.51      4.01      6.81      2.54      5.45      6.61      7.49      7.16      6.18       -33.22
7nR   5.09      0.97      4.72      3.14      3.74      5.57      7.95      3.94      7.90      9.71       -31.17
8nR   4.93      0.54      4.46      7.53      4.95      5.93      11.03     1.96      6.21      7.75       -35.43
9nR   5.25      1.17      5.15      3.09      3.39      3.30      10.05     2.66      4.76      10.82      -26.50
10nR  5.36      0.59      5.96      3.67      2.73      2.47      4.66      3.12      10.63     8.27       -23.69
11nR  4.99      1.02      5.71      7.48      4.51      6.22      4.61      4.28      10.65     9.20       -35.23
12nR  4.84      0.30      4.98      7.57      6.07      5.04      7.05      3.07      7.42      8.98       -35.08


[Example 3] 
[0100] 

Evaluation of gene expression function for prediction of the postoperative prognosis in 
primary breast cancer 
[0101] 
(Tissue sample) 

A tissue sample was collected in the same manner as described in Example 1. 
Among 954 patients followed clinically for 5 years or more, or until death, after 
an operation for breast cancer performed between 1995 and 1997, 10 cases of death within 5 
years after the operation and 10 cases of disease-free survival for 5 years or more after the 
operation were selected as samples. The clinical backgrounds of the two patient groups 
were matched as strictly as possible for age, lymph node metastasis, 
tumor diameter and tissue type (Table 8). The clinical backgrounds of the additional 20 cases 
used for testing the final prediction system are summarized in Table 9. 
[0102] 

(Clinical profile of patients used for microarray analysis) 
[0103] 
[Table 8] 
Group     Case  T  N  M  Stage  Age  N a)  ly b)  f c)  ER d)
Survived  MS1   2  1  0  II     52   4     1      2     P
Survived  MS2   2  2  0  II     47   2     0      1     P
Survived  MS3   2  2  0  II     40   5     0      1     N
Survived  MS4   2  2  0  II     64   3     0      1     N/A
Dead      MD1   2  2  0  II     47   5     0      0     P
Dead      MD2   2  2  0  II     34   3     3      0     N
Dead      MD3   2  2  0  II     66   4     0      3     N
Dead      MD4   2  0  0  II     71   2     0      1     P

a) Number of lymph nodes involved. 
b) Lymph vessel invasion: 0, no cancer cells in vessels; 3, many cancer cells in vessels. 
c) Fat invasion: 0, no invasion to fat tissue; 3, severe invasion to fat tissue. 
d) Estrogen receptor status: P, positive; N, negative; N/A, not available. 



[0104] 

(Clinical profile of patients used for RT-PCR analysis) 
[0105] 
[Table 9] 
Group     Case  T  N  M  Stage  ly a)  f b)
Survived  S1    2  0  0  II     0      1
Survived  S2    2  2  0  II     1      0
Survived  S3    2  2  0  II     0      2
Survived  S4    2  1  0  II     1      0
Survived  S5    2  2  0  II     3      2
Survived  S6    2  0  0  II     0      0
Survived  S7    2  1  0  II     0      0
Survived  S8    2  1  0  II     0      2
Survived  S9    2  1  0  II     1      2
Survived  S10   2  1  0  II     0      0
Dead      D1    2  1  0  II     0      1
Dead      D2    2  2  0  II     0      0
Dead      D3    2  2  0  II     3      0
Dead      D4    2  2  0  II     0      3
Dead      D5    2  2  0  II     1      3
Dead      D6    2  1  0  II     0      1
Dead      D7    2  0  0  II     0      1
Dead      D8    2  1  0  II     0      0
Dead      D9    2  4  0  IV     1      0
Dead      D10   2  1  0  II     0      2

a) Lymph vessel invasion 
b) Fat infiltration 

Group     Age c)  Lymph node involvement d)
Survived  52.8    7.6
Dead      56.0    5.4

c) Mean age 
d) Average number of lymph nodes involved 



[0106] 

(Clinicopathological parameter) 
The clinicopathological parameters were checked by the method described in Example 1. 

[0107] 
(Preparation of cDNA microarray) 

A cDNA microarray was prepared by the method described in Example 2. 
[0108] 

(RNA extraction and RNA amplification) 
RNA was extracted using TRIzol (Invitrogen, Carlsbad, CA, USA). To exclude 
degraded RNA, each extracted RNA (1 μg) was subjected to electrophoresis on a 3.0% 
formaldehyde denaturing gel. To remove contaminating DNA, purification was carried out 
using the RNeasy kit (QIAGEN, Valencia, CA). T7 RNA polymerase-based amplification was 
carried out with the MessageAmp aRNA kit (Ambion, Austin, TX) to prepare the RNA used 
for microarray analysis. In the first amplification, RNA (5 μg) was used as the 
template. Thereafter, the first-round amplified RNA (aRNA) (2 μg) was used as the template for 
the second amplification. The amplified aRNAs were purified with the RNeasy purification kit, 
and the amount of each aRNA was measured with a spectrophotometer. 
[0109] 

(Labeling of aRNA, hybridization and data analysis) 

A hybridization probe was produced from the aRNA (5 μg) obtained by the second 
amplification, using the Amino Allyl cDNA labeling kit (Ambion, 
Austin, TX). Probes derived from cancer RNA and normal control RNA were labeled with 
Cy5 and Cy3 Mono-Reactive Dye (Amersham Biosciences UK Limited, Buckinghamshire, 
UK), respectively. 
[0110] 

To remove unbound dye, the labeled probes were purified with the QIAquick PCR 
purification kit (QIAGEN, Valencia, CA). Each 10 pmol of fluorescently labeled probe 
from tumor and normal RNA was mixed with 4x microarray hybridization buffer 
(Amersham (UK)) and de-ionized formamide. The probe mixture was hybridized to a 
cDNA array at 40°C for 15 hours. Thereafter, the array was washed with 0.1x SSC 
containing 0.2% SDS once for 5 minutes and then twice for 10 minutes. All procedures were 
carried out in an Automated Slide Processor System (Amersham). The signal intensity of each 
hybridization was read by GenePix 4000 (Amersham) and evaluated by GenePix Pro 3.0 
(Axon Instruments, Inc., Foster City, CA, USA). The read signals were normalized by the 
total gene normalization method (Yang, Y.H., Dudoit, S., Luu, P., Lin, D.M., Peng, V., Ngai, 
J., and Speed, T.P. (2002). Nucleic Acids Res 30, e15; Manos, E.J., and Jones, D.A. 
(2001). Cancer Res 61, 433-438). 
[0111] 

To identify genes differentially expressed between the survival group and the 
dead group, normalized signals were analyzed by the Mann-Whitney test; the normalized 
signals were applied to a series of Xs. X represents the Cy5/Cy3 signal intensity ratio for each 
gene and each sample (Ono, K., et al. (2000). Cancer Res 60, 5007-5011). Genes showing 
a U value of 0 in the Mann-Whitney test and genes showing a difference of 2-fold or more 
in expression intensity between the two groups were selected. Genes with S/N ratios of less 
than 3.0 were excluded from investigation. 
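
For the 4-versus-4 design used here, a U value of 0 means that every sample of one group ranks below every sample of the other. How rare that is under chance alone can be checked by enumerating all rank assignments. The sketch below is illustrative only: it assumes no tied values, and the function name is hypothetical.

```python
from itertools import combinations

def exact_u_tail(n1, n2, u_obs):
    """Count rank assignments of group 1 whose U is at most u_obs.

    U for group 1 is its rank sum minus n1 * (n1 + 1) / 2, i.e. the number
    of (group 1, group 2) pairs in which the group 1 value is the larger.
    Returns (number of assignments with U <= u_obs, total assignments).
    """
    ranks = range(1, n1 + n2 + 1)
    hits = total = 0
    for group in combinations(ranks, n1):
        u = sum(group) - n1 * (n1 + 1) // 2
        total += 1
        if u <= u_obs:
            hits += 1
    return hits, total

hits, total = exact_u_tail(4, 4, 0)
print(hits, total, 2 * hits / total)
```

Only 1 of the 70 possible assignments gives U = 0, so the two-sided exact p value is 2/70 (about 0.029). This is the smallest p value attainable with four samples per group.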
[0112] 

(Semi-quantitative RT-PCR experiment and gene expression ratio) 

To verify the microarray data, the present inventors carried out a 
semi-quantitative RT-PCR experiment by reverse-transcribing RNA (10 μg). To adjust 
the concentration of the transcribed cDNA, GAPDH was selected as an internal control, and 
semi-quantitative RT-PCR was carried out (Ono, K., et al. (2000). Cancer Res 60, 
5007-5011). Primers for GAPDH were 5'-ggaaggtgaaggtcggagt-3' (forward) and 
5'-tgggtggaatcatattggaa-3' (reverse). After adjusting the concentration of the primer, 
semi-quantitative RT-PCR was carried out on selected genes in samples from the survival 
group and the dead group. Primers for the genes (Table 10) were designed based on 
sequence information from NCBI GenBank (http://www.ncbi.nlm.nih.gov/) and with Primer3 on 
the website (http://www.genome.wi.mit.edu/cgi-bin/primer/primer3_www.cgi). Each 
semi-quantitative RT-PCR experiment was performed using, as a template, cDNA (1 μl) 
of adjusted concentration, 5 U of Takara Ex Taq (Takara, Otsu, Japan), 1x PCR buffer 
(10 mM Tris-HCl, 50 mM KCl, 1.5 mM MgCl2), 10 nM dNTPs, and 10 pmol each of forward 
and reverse primers, in a total volume of 30 μl. 
SEQ ID No. 160 ggaaggtgaaggtcggagt 
SEQ ID No. 161 tgggtggaatcatattggaa 
[0113] 

(Primer of semi-quantitative PCR) 
[0114] 
[Table 10] 
gene       SEQ ID No.  Forward                  SEQ ID No.  Reverse
PIIIP      162         CCTCCAACTGCTCCTACTOG     163         TCGAAGCCTCTGTGTCCTTT
C1r        164         GAAGTTGTGGAGGGACGTGT     165         GACTTCCAGCAGCTTCCATC
DPYSL3     166         CATGTACTGAGCAGGCCAGA     167         AAGATCTTGGCAGCGTTTGT
PTK9L      168         TTGTGATTGAGGACGAGCAG     169         AATGGTTTCCCGCTCTAGGT
CPE        170         CTCCTGAGACCAAGGCTGTC     171         TGAAGGTCTCGGACAAATCC
α-tubulin  172         GGAACGCCTGTCAGTTGATT     173         CTCAAAGCAAGCATTGGTGA
β-tubulin  174         TCTGTTCGCTCAGGTCCTIIT    175         TGGTGTGGTCAGCTTCAGAG
HSP 90-α   176         AAAAATGGCCTGAGTTAAGTGT   177         TCCTCAATTTCCCTGTGTTTG
MDH        178         TGCACACTAACAGCATGACG     179         GAATTTCTTTCCTCTGCCTGA
NDUFB3     180         GGGATAAACCAGACAAGTAGGC   181         GGACATGAGCATGGACATCA


[0115] 

To evaluate the difference in gene expression levels between the survival group and the 
dead group, each semi-quantitative PCR product (8 μl) was subjected to electrophoresis on a 
2.5% agarose gel and stained with ethidium bromide. The intensity of each stained 
band was measured with an AlphaImager 3300 (Alpha Innotech, San Leandro, CA) using 
background correction. To obtain the expression level of each gene, the expression 
ratio was normalized to the expression level of GAPDH. 
[0116] 

The expression ratio was defined by the following formula: Expression ratio of gene A 
= (16-bit imaging score of the semi-quantitative PCR product of gene A (intensity of the band stained with ethidium 
bromide) in cancer sample X) / (16-bit imaging score of GAPDH in cancer sample X) 
[0117] 

(Definition of prognosis index (PI) of primary breast cancer) 

The present inventors defined the prognosis index (PI) of primary breast cancer by 
subtracting the sum of normalized expression ratios of genes highly expressed in the 5D 
group from the sum of normalized expression ratios of genes highly expressed in the 5S 
group. The significance of the difference in expression ratios between the two groups was evaluated by 
Student's t-test. Comparison of PI between the 5S group and the 5D group was carried out 
by the Mann-Whitney test. All statistical analyses were carried out using StatView version 5.0 
(SAS Institute Inc., Cary, NC). 

[0118] 

(Result) 

On a cDNA microarray composed of 18432 human genes, genome-wide gene 
expression functions of tumors from 8 breast cancer patients were examined. Four patients 
survived free of disease for 5 years or more after an operation (5S), and four patients died of 
breast cancer within 5 years after the operation (5D). The clinical backgrounds of the 
two patient groups were matched as strictly as possible for age, tumor 
diameter, lymph node metastasis, hormone receptor status and tissue type (Table 8). 
[0119] 

To identify genes expressed differently between the 5D group and the 5S
group, the present inventors analyzed the cDNA microarray data by the
Mann-Whitney test. In total, 23 genes, six of which are ESTs/virtual proteins,
showed a U value of 0 in the Mann-Whitney test and were highly expressed in the 5S
group (Table 11).
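
A U value of 0 means that the two groups do not overlap at all for that gene: every sample in one group has a higher signal than every sample in the other. The following minimal sketch computes the U statistic directly from its pairwise definition. The signal values are hypothetical and serve only to illustrate a four-versus-four comparison.

```python
def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two Mann-Whitney U statistics.

    U for group_a counts the pairs (a, b) with a > b; ties count 0.5.
    """
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)


# Hypothetical microarray signals for one gene (4 tumors per group).
signal_5s = [5.1, 4.8, 6.0, 5.5]
signal_5d = [1.2, 2.0, 1.7, 2.4]
print(mann_whitney_u(signal_5s, signal_5d))  # 0.0 -> complete separation
```

With four samples per group, U = 0 is the most extreme value the statistic can take, so this filter keeps only genes whose signals separate the two groups completely.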
[0120] 

(Gene group highly expressed in survival group by microarray analysis) 

[0121] 

[Table 11] 
Gene name and detail | Accession Number | Fold change
IMAGED 159 3' similar to gb:J04173 PHOSPHOGLYCERATE MUTASE, BRAIN FORM | R51864 | 4.3114
IMAGE:22798 3', mRNA sequence | R39171 | 2.918
cDNA clone IMAGE:1693352 3', mRNA sequence | Al 14(1851 | 2.891
CCNDBP1 cyclin D-type binding-protein 1 | AF082569 | 3.202
ESTs | AI446435 | 3.251
pro-alpha-1 type 3 collagen | X14420.1 | 3.394
complement component C1r | J04080.1 | 3.396
DPYSL3 dihydropyrimidinase-like 3 | D78014 | 3.625
ribosomal protein L6 | X69391.1 | 3.807
PTK9L protein tyrosine kinase 9-like (A6-related protein) | Y17169.1 | 4.143
Homo sapiens full length insert cDNA YN88E09 | AF075050.1 | 4.257
somatostatin receptor isoform 2 (SSTR2) gene | M81830.1 | 5.475
CPE carboxypeptidase E | NM_001873.1 | 5.807
YR-29 hypothetical protein YR-29 | AJ012409.1 | 6.333
IMAGE:4822062, mRNA | BC034SH | 6.373
KIAA1832 protein, partial cds | AB058735.1 | 13.352
CREG cellular repressor of E1A-stimulated genes | AF084523.1 | 2.739
Homo sapiens putative splice factor transformer2-beta mRNA, complete cds | U61267.1 | 2.55
Human N-acetyl-beta-glucosaminidase (HEXB) mRNA, 3' end | M13519.1 | 2.698
Human cytochrome b5 mRNA, complete cds | M22865.1 | 2.881
Human pS2 mRNA induced by estrogen from human breast cancer cell line MCF-7 | X00474.1 | 2.702
Human alpha-tubulin mRNA, complete cds | K00558 | 4.655
Homo sapiens clone 24703 beta-tubulin mRNA, complete cds | AF070561.1 | 3.917

[0122] 

Table 12 lists 21 genes, including 6 ESTs/virtual proteins, that were highly expressed
in the 5D tumors and had a U value of 0 in the Mann-Whitney test. In the table, the
difference in gene expression between the two groups is shown as "fold change".
[0123] 

(Gene group highly expressed in dead group by microarray analysis) 

[0124] 

[Table 12] 



Gene name and detail | Accession Number | Fold change
Lyam-1 mRNA for leukocyte adhesion molecule-1 | X16150.1 | 7.459
APM2 adipose specific 2 | NM_006829.1 | 4.853
DNA polymerase gamma mRNA, nuclear gene encoding mitochondrial protein | U60325.1 | 4.269
FLJ22128 fis, clone HEP19543 | AK025781 | 4.109
actin related protein 2/3 complex, subunit 4, 20kDa (ARPC4) | NM_005718.2 | 4.058
Scd mRNA for stearoyl-CoA desaturase | AB032261.1 | 3.794
novel heterogeneous nuclear RNP protein, L protein | X16135.1 | 3.771
ENSA endosulfine alpha | AF157509.1 | 3.511
IMAGE:26483 5' similar to gb:X15183_cds1 HEAT SHOCK PROTEIN HSP 90-ALPHA | R12732 | 3.086
malonyl-CoA decarboxylase (MLYCD) | NM_012213 | 3.067
anion exchanger 3 brain isoform (bAE3) | U05596.1 | 2.889
IMAGE:43550 3', mRNA sequence | H05914 | 2.345
cDNA FLJ2363C fis, clone CAS07176 | AK074216 | 2.426
IMAGE:26366 3' similar to gb:D16234 PROBABLE PROTEIN DISULFIDE ISOMERASE ER-60 PRECURSOR | R20554 | 2.519
Similar to hypothetical protein PRO2831, clone MGC:23813 IMAGE:4273837, mRNA, complete cds | BC017905.1 | 2.551
FLJ40629 hypothetical protein FLJ40629 | AK097948.1 | 2.417
ribosomal protein L29 (humrpl29) mRNA, complete cds | U10248.1 | 2.203
EST, clone IMAGE:745452, 3' end | AA625869 | 2.591
KIAA1554 KIAA1554 protein | AB046774.1 | 2.544
IMAGE:53316 3' similar to SP:MDHC_MOUSE P14152 MALATE DEHYDROGENASE, CYTOPLASMIC | R15814 | 2.867
NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 3 (12kD, B12), clone MGC:9039 IMAGE:3881592 | BC018183 | 4.972



[0125] 

From the 23 genes highly expressed in the 5S group and the 21 genes highly expressed in
the 5D group, prediction markers for postoperative prognosis were selected according to
the following criteria: (1) in the microarray analysis, the difference in signal strength
between 5S and 5D is larger than 2.0-fold in all cases; (2) the signal strength differs
significantly between 5S and 5D in semi-quantitative PCR (p value < 0.05 in Student's
t-test); (3) the result of semi-quantitative PCR is reproduced in three independent
experiments. Seven genes highly expressed in the 5S tumors and 3 genes highly expressed
in the 5D tumors satisfied these criteria for selecting a prognosis marker.
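
The three selection criteria can be expressed as a simple filter. The sketch below is illustrative only: the `Candidate` record, its field names and the example values are hypothetical and do not reproduce the actual measurements.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    min_fold_change: float  # smallest 5S/5D microarray difference over all cases
    t_test_p: float         # Student's t-test p value from semi-quantitative PCR
    replicated: bool        # PCR result confirmed in three independent experiments


def passes_selection(c: Candidate) -> bool:
    """Apply criteria (1) to (3): fold > 2.0, p < 0.05, reproduced in triplicate."""
    return c.min_fold_change > 2.0 and c.t_test_p < 0.05 and c.replicated


# Hypothetical candidates, for illustration only.
candidates = [
    Candidate("geneA", 3.4, 0.0004, True),   # passes all three criteria
    Candidate("geneB", 1.8, 0.0100, True),   # fails (1): fold change too small
    Candidate("geneC", 2.9, 0.2000, True),   # fails (2): not significant
    Candidate("geneD", 4.1, 0.0200, False),  # fails (3): not reproduced
]
selected = [c.name for c in candidates if passes_selection(c)]
print(selected)  # ['geneA']
```

All three conditions are required, so a gene with a large microarray fold change is still dropped if the PCR result is not significant or not reproducible.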
[0126] 

The 7 genes highly expressed in the 5S group encode
pro-alpha-1 type 3 collagen (PIIIP), complement component C1r, dihydropyrimidinase-like
3 (DPYSL3), protein tyrosine kinase 9-like (PTK9L), carboxypeptidase E (CPE), α-tubulin
and β-tubulin. The p values of these marker genes in Student's t-test were 0.00039,
0.0012, 0.0042, 0.036, 0.039, 0.034 and 0.00069, respectively.
[0127] 

The 3 marker genes highly expressed in the 5D group encode heat shock protein HSP
90-alpha, malate dehydrogenase, and NADH dehydrogenase (ubiquinone) 1 beta
subcomplex 3 (NDUFB3). The p values of these genes in Student's t-test were 0.05,
0.0055 and 0.011, respectively.
[0128] 

The present inventors normalized the results of semi-quantitative RT-PCR to GAPDH
as an internal control and evaluated the normalized results to verify the selection of
the marker genes.
[0129] 

The present inventors carried out semi-quantitative PCR on 20 additional, randomly
selected cases. Ten of these patients died of breast cancer within 5 years after the
operation, and the remaining 10 patients survived free of disease for 5 years or more
after the operation. Fig. 7 shows the RT-PCR results for the 7 marker genes highly
expressed in the 5S tumors. Fig. 8 shows the RT-PCR results for the 3 marker genes
highly expressed in the 5D tumors.
[0130]

The present inventors defined the prognosis index (PI) as follows: PI = (sum of
normalized expression ratios of genes highly expressed in the 5S group) - (sum of
normalized expression ratios of genes highly expressed in the 5D group). The expression
ratios of the selected marker genes in the additional test cases are summarized together
with the prognosis indices in Table 13.
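
As a worked example of this definition, the following sketch recomputes the PI of case S10 from the expression ratios listed for it in Table 13. The dictionary keys and the function name are illustrative choices, not part of the original method description.

```python
# Marker genes as named in this document.
S_GENES = ("PIIIP", "C1r", "DPYSL3", "PTK9L", "CPE", "alpha-tubulin", "beta-tubulin")
D_GENES = ("HSP90", "MDH", "NDUFB3")


def prognosis_index(ratios: dict) -> float:
    """PI = sum of ratios of 5S marker genes minus sum of ratios of 5D marker genes."""
    sum_s = sum(ratios[g] for g in S_GENES)
    sum_d = sum(ratios[g] for g in D_GENES)
    return sum_s - sum_d


# Normalized expression ratios of case S10 as listed in Table 13.
s10 = {
    "PIIIP": 2.9, "C1r": 3.0, "DPYSL3": 0.9, "PTK9L": 1.5, "CPE": 1.2,
    "alpha-tubulin": 1.0, "beta-tubulin": 1.7,
    "HSP90": 1.3, "MDH": 0.4, "NDUFB3": 1.3,
}
print(round(prognosis_index(s10), 1))  # 9.2
```

The result matches the tabulated values for S10 (sum of S 12.2, sum of D 3.0, PI 9.2).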
[0131] 

(Expression ratio of gene and prognosis index) 
[0132] 



[Table 13] 



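Genes highly expressed in 5S: PIIIP, C1r, DPYSL3, PTK9L, CPE, α-tubulin, β-tubulin
Genes highly expressed in 5D: HSP 90, MDH, NDUFB3

Case | PIIIP | C1r | DPYSL3 | PTK9L | CPE | α-tubulin | β-tubulin | HSP 90 | MDH | NDUFB3 | Sum of S* | Sum of D** | PI+
S1 | 1.3 | 4.0 | 2.1 | 3.3 | 2.4 | 0.8 | 2.5 | 1.5 | 0.2 | 1.5 | 16.9 | 3.2 | 13.7
S2 | 5.7 | 3.5 | 3.3 | 3.4 | 6.0 | 1.2 | 5.1 | 2.2 | 0.6 | 2.2 | 28.1 | 5.0 | 23.1
S3 | 3.1 | 5.8 | 2.2 | 3.4 | 8.1 | 1.8 | 5.3 | 1.5 | 0.4 | 1.5 | 29.7 | 3.5 | 26.2
S4 | 7.1 | 10.2 | 8.6 | 6.0 | 16.0 | 4.1 | 8.0 | 3.4 | 4.8 | 3.4 | 60.0 | 11.5 | 48.5
S5 | 6.8 | 7.4 | 7.2 | 6.9 | 11.2 | 2.7 | 7.0 | 5.5 | 3.6 | 5.5 | 49.1 | 14.6 | 34.5
S6 | 4.0 | 4.2 | 1.7 | 2.2 | 3.6 | 0.9 | 6.0 | 2.9 | 0.9 | 2.9 | 22.7 | 6.6 | 16.1
S7 | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | 0.7 | 3.4 | 0.4 | 0.3 | 0.4 | 13.7 | 1.1 | 12.6
S8 | 3.3 | 3.6 | 1.1 | 0.7 | 0.8 | 1.3 | 5.0 | 2.3 | 1.4 | 2.3 | 15.9 | 6.0 | 9.8
S9 | 3.1 | 3.9 | 2.7 | 3.7 | 2.9 | 1.6 | 4.1 | 1.0 | 1.2 | 1.0 | 21.9 | 3.2 | 18.8
S10 | 2.9 | 3.0 | 0.9 | 1.5 | 1.2 | 1.0 | 1.7 | 1.3 | 0.4 | 1.3 | 12.2 | 3.0 | 9.2
D1 | 0.1 | 2.9 | 0.4 | 1.9 | 2.9 | 0.7 | 0.8 | 3.4 | 3.0 | 3.4 | 9.6 | 9.7 | -0.1
D2 | 0.2 | 0.6 | 0.1 | 0.2 | 0.8 | 0.2 | 0.8 | 1.0 | 4.9 | 1.0 | 2.9 | 7.0 | -4.1
D3 | 0.2 | 3.7 | 0.2 | 1.0 | 0.6 | 0.6 | 2.8 | 3.6 | 6.6 | 3.6 | 9.0 | 13.8 | -4.8
D4 | 0.2 | 1.4 | 0.4 | 0.9 | 1.0 | 0.5 | 1.7 | 3.5 | 3.6 | 3.5 | 6.1 | 10.7 | -4.6
D5 | 0.1 | 1.3 | 0.1 | 0.9 | 0.6 | 0.5 | 1.0 | 3.2 | 0.3 | 3.2 | 4.5 | 6.7 | -2.2
D6 | 2.2 | 2.5 | 1.2 | 1.9 | 2.0 | 0.5 | 1.7 | 3.8 | 3.5 | 4.2 | 12.0 | 11.5 | 0.5
D7 | 2.2 | 2.1 | 0.9 | 1.9 | 2.4 | 0.3 | 1.6 | 1.9 | 1.4 | 2.0 | 11.5 | 5.3 | 6.2
D8 | 1.6 | 2.7 | 1.1 | 2.6 | 1.8 | 0.4 | 1.8 | 3.4 | 2.8 | 3.4 | 12.0 | 9.6 | 2.5
D9 | 1.2 | 1.4 | 0.6 | 1.6 | 1.2 | 0.6 | 2.4 | 2.2 | 0.7 | 2.2 | 9.2 | 5.0 | 4.1
D10 | 0.5 | 0.8 | 0.4 | 0.6 | 0.4 | 0.4 | 1.3 | 3.6 | 1.6 | 3.6 | 4.5 | 8.9 | -4.4

* Sum of the expression ratios (ER) of PIIIP, C1r, DPYSL3, PTK9L, CPE, α-tubulin and β-tubulin
** Sum of the ER of HSP 90, MDH and NDUFB3
+ PI: Sum of S - Sum of D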



[0133] 

The PI correctly predicted the actual clinical outcome in all cases: all 10 cases of the
5S group (S1 to S10) had a prognosis index higher than 7, and all 10 cases of the 5D
group (D1 to D10) had a prognosis index lower than 7. The mean PI was 21.2 for the 5S
group and -0.7 for the 5D group. Thus, a PI value of 7 clearly distinguished the 5S
tumors from the 5D tumors (p = 0.0002).
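
The separation described above can be checked directly from the PI column of Table 13. The sketch below classifies each case with the cut-off of 7, computes the group means, and estimates a two-sided Mann-Whitney p value. The text does not state how the reported p value was computed, so the normal approximation without continuity correction used here is an assumption, and the function names are illustrative.

```python
import math

# Prognosis indices of the 20 test cases, taken from Table 13.
PI_5S = [13.7, 23.1, 26.2, 48.5, 34.5, 16.1, 12.6, 9.8, 18.8, 9.2]
PI_5D = [-0.1, -4.1, -4.8, -4.6, -2.2, 0.5, 6.2, 2.5, 4.1, -4.4]
CUTOFF = 7.0


def predict(pi: float) -> str:
    """Assign a case to the good (5S) or poor (5D) prognosis group."""
    return "5S" if pi > CUTOFF else "5D"


def mean(values):
    return sum(values) / len(values)


def mann_whitney_p(a, b):
    """Return (U, two-sided p) using the normal approximation (assumed here)."""
    u = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    n1, n2 = len(a), len(b)
    u = min(u, n1 * n2 - u)
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mu) / sigma
    return u, math.erfc(abs(z) / math.sqrt(2.0))


all_correct = all(predict(p) == "5S" for p in PI_5S) and all(
    predict(p) == "5D" for p in PI_5D
)
u_stat, p_value = mann_whitney_p(PI_5S, PI_5D)
print(all_correct)                  # True: every case falls on the expected side of 7
print(mean(PI_5S), mean(PI_5D))     # about 21.2 and about -0.7
print(u_stat, p_value)              # U = 0; p is on the order of 0.0002
```

Under this assumption the result agrees with the reported figures: the two groups do not overlap (U = 0), and the approximate p value has the same magnitude as the stated p = 0.0002.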

[INDUSTRIAL APPLICABILITY] 
[0134] 

The postoperative prognosis prediction system of the present invention is effective for
predicting the postoperative risk of a breast cancer patient. Further, the wide-ranging
expression list of breast cancer-correlated genes of the present invention can provide
various information on the progression of breast cancer, and latent target molecules for
breast cancer therapy can be predicted from the list.
[BRIEF DESCRIPTION OF DRAWINGS] 
[0135] 

[Fig. 1] Fig. 1 shows a gene group (A) showing increased expression and a gene group (B)
showing decreased expression in the 5y-D group as compared with the 5y-S group.
[Fig. 2] Fig. 2 shows the results of semi-quantitative RT-PCR analysis of RNAs derived
from the 5y-S group and the 5y-D group.
[Fig. 3] Fig. 3 shows the prognosis scores of individual patients.
[Fig. 4] Fig. 4 shows the results of semi-quantitative RT-PCR analysis of RNAs derived
from the 5Y-F group and the 5Y-R group.
[Fig. 5] Fig. 5 shows the results of semi-quantitative RT-PCR analysis of RNAs derived
from the 5Y-F group and the 5Y-R group.
[Fig. 6] Fig. 6 shows the prognosis scores of individual patients.
[Fig. 7] Fig. 7 shows the results of semi-quantitative PCR analysis of the 7 genes highly
expressed in the 5S tumors.
M: marker ladder
S1-S10: newly examined tissues of patients who survived free of disease for 5 years or
more after the operation.
D1-D10: newly examined cases of patients who died of breast cancer within 5 years after
the operation.
Differences in expression strength were evaluated by Student's t-test; a p value of
0.05 or less was regarded as statistically significant.
[Fig. 8] Fig. 8 shows the results of semi-quantitative PCR analysis of the 3 genes highly
expressed in the 5D group. For an explanation of the marks, see the explanation of Fig. 7.
[Fig. 9] Fig. 9 shows the prognosis indices (PI) of the 20 newly examined
cases. The indices of all 10 patients who survived free of disease for 5 years or more
were higher than 7, whereas the indices of the patients who died of breast cancer within
5 years after the operation were lower than 7. The difference between the distributions
of the two groups is statistically significant (p = 0.0002).
[Sequence Listing] 
[Title of Document] ABSTRACT

[Problem] To provide a system for predicting the postoperative prognosis of a
patient with breast cancer from the viewpoint of gene expression, based on data obtained
by genome-wide, comprehensive analysis of gene expression in breast cancer.
[Solution] The expression of human genes is comprehensively analyzed using a DNA
microarray, and the gene expression functions in various breast cancer conditions are
compared, thereby establishing a system for predicting the postoperative prognosis of
breast cancer.
[Title of Document] Drawings
[Fig. 1]
[Fig. 3]
[Fig. 5]

[Fig. 6]
[Fig. 9]
SEQUENCE LISTING



<110> Nippon Medical School, 

Mitsubishi Kagaku Bio-Clinical Laboratories, Inc.,
Mitsubishi Rayon Co. , Ltd. 

<120> Genes involved in predicting postoperative prognosis breast cancers 

<130> NP04-1001 

<160> 181 

<170> PatentIn version 3.1

<210> 1 

<211> 19 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 1 

ggaaggtgaa ggtcggagt 19 

<210> 2 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 2 

tgggtggaat catattggaa

<210> 3

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 3 

acacttcatc tgctccctca tag 

<210> 4 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 4 

ctgcctagac ctgaggactg tag 

<210> 5 

<211> 21 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 5 

actgaggcct tttggtagtc g 



<210> 6 

<211> 24 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 6 

tctctttatt gtgatgctca gtgg 

<210> 7 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 7 

aaatccttct cgtgtgttga ctg 

<210> 8 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 8 

cagtcatgag ggctaaaaac tga 



<210> 9 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 9 

gaagacaaca agttttaccg gg 

<210> 10 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 10 

atggttttat tgacggcaga ag 

<210> 11 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 11 

aggacacgtc ctctcctctc tc 



<210> 12 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 12 

taaagctagc gaaggaacgt aca 

<210> 13 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 13 

tcccttctgt ttcctcagtg tt 



<210> 14 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 14 

cctgccccga taaaaatatc tac 



<210> 15 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 15 

ttgaccttaa gcctcttttc ctc

<210> 16 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 16 

ataacgtaca ttcccatgac acc

<210> 17 

<211> 21 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 17 

actttcaaga tgggaccaag g 



<210> 18 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 18 

atatacacag aagcatgacg cag 

<210> 19 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 19 

ttgctggact ctgaaatatc cc 

<210> 20 

<211> 24 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 20 

ttcccctgta cagtatttca ctca 



<210> 21 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 21 

ctgagcaatc tuctctatcc tct 

<210> 22 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 22 

gttccagatt cgtgagaatg act 

<210> 23 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 23 

accagtaaca actgtgggat gg 



<210> 24 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 24 

caaatgagct acaacacaca agg 

<210> 25 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 25 

ccccctccac cttgtacata at 

<210> 26 

<211> 21 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 26 

gttttcgttt ggctggttgt g 



<210> 27 

<211> 21 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 
<400> 27 

gtctgagatt ttactgcacc g 

<210> 28 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized
<400> 28 

attgctaagg ataagtgctg ctc

<210> 29 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 
<400> 29 

tgtcagtata gaagcctgtg ggt 



<210> 30 

<211> 23 

<212> DNA

<213> Artificial 

<220> 

<223> synthesized 
<400> 30 

ttcttaggcc atcccttttc tac 

<210> 31 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 
<400> 31 

gcatctgaat gtctttctcc cta

<210> 32 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 32 

ccataggatc ttgactccaa cag



<210> 33 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 33 

actgggagtg gaggaaatta gag 

<210> 34 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 34 

ctaatgtaag ctccattggg atg 

<210> 35 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 35 

caaactgcaa actagctccc taa



<210> 36 

<211> 23 

<212> DNA 

<213> Artificial

<220> 

<223> synthesized 

<400> 36 

aggtaaccca aagtgacaaa cct 

<210> 37 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 37 

aagactaaga gggaaaatgt ggg 

<210> 38 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 38 

aggtaaccca aagtgacaaa cct 



<210> 39 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 39 

ttaagtgagt ctccttggct gag 

<210> 40 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 40 

agggccccta tatccaatac cta

<210> 41 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 41 

agtcattcag aatccattga gac



<210> 42 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 42 

tgggtggaat catattggaa 

<210> 43 

<211> 18 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 43 

gaaaggtgaa ggtcggagt 

<210> 44 

<211> 21 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 44 

tgggtggaat catattggaa 



<210> 45 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 45 

ccagacatcc atggtaccta taa 

<210> 46 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 46 

tatgcattga aaccttacag ggg 

<210> 47 

<211> 24 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 47 

ctgttaaaca aagcgaggtt aagg 



<210> 48 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 48 

gggttctgca tctcgtttat tag 

<210> 49

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized

<400> 49 

gacacatagc tcataggcac aca 

<210> 50 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 50 

ttctggtaca t«gtaagtgc tca



<210> 51 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 51 

tccgccatat tgattctgct ta 

<210> 52 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 52 

gtttgctttc tggaccatgg ata 

<210> 53 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 53 

gataacaact ggaccacatc cc 



<210> 54 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 54 

aacaggcaga cgaggtagac ac 

<210> 55 

<211> 21 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 55 

gagaaggatg ggtccaccag t 

<210> 56 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 56 

gtacatgggc agcacaaatg tat 



<210> 57 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 57 

atttcattgg tagtatggcc cac 

<210> 58 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 58 

ataccatggg acaggattgt aag 

<210> 59 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 59 

gctcagacca gctcatactt cat 



<210> 60 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 60 

ccaaagactg gggtaggtaa aac 

<210> 61 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 61 

ctggtgcttt ctntcacctc ttc 

<210> 62 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 62 

gactagtgtg aaacaagatg ggc 



<210> 63 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 63 

cttgaaccca ggagtttgag ac 

<210> 64 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 64 

gtgcctcagc ttcctgagta gc 

<210> 65 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 65 

ctggtgctga ctatccagtt ga 



<210> 66 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 66 

ctggtaaact gtccaaaaca agg 

<210> 67 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 67 

ctcttacctg gacnaggtgc gt 

<210> 68 

<211> 21 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 68 

ggatgagctc tgcxcttga g 



<210> 69 

<211> 21 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 69 

caatgtttga ccagtcccag a 

<210> 70 

<211> 24 

<212> DNA 

<213> Artificial

<220>

<223> synthesized

<400> 70 

catgttgtct cagtcctcta ttgg 

<210> 71 

<211> 21 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized



<400> 71 

ggacagcagc tggagtacac a 



<210> 72 

<211> 21 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized

<400> 72 

aatcagattt gtcggtgcct t 

<210> 73 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 73 

ggctctgcac taagaacaca gag 

<210> 74 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 74 

acaactagct ctcagttcag gca 



<210> 75 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 75 

tggagcagta tgacaagcta caa 

<210> 76 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 76 

aagcagcact gcttaaactg ttc

<210> 77 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 77 

taagtacttt cctgtgggtc gct



<210> 78 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 78 

ccacaaacag gaagctatgt tct 

<210> 79 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 79 

gtactattag ccatggtcaa ccc 

<210> 80 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 80 

ctacagaagg aatgatctgg tgg 



<210> 81 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 81 

atcagtacgg ggixcttaca aac 

<210> 82 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 82 

cctgtactga gctctccaaa gac 

<210> 83 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 83 

tccctagctt cctctccaca 



<210> 84 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 84 

agaatcatgc ctccccttct

<210> 85 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 85 

acccctcaag tgtaaggaac tg

<210> 86 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 86 

ggatcaagag tgtgtgtgtg tgt 



<210> 87 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 87 

caatgccaga gagaatatcc aga 

<210> 88 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 88 

gatacccatt gtgtaccctc tcc

<210> 89 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 89 

ccactccaca taaggggttt ag 



<210> 90 
<211> 23 
<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 90 

gaggttctag ctaagtgcag ggt 

<210> 91 

<211> 25 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 91 

ccattgacat tggagttaag tatgc 

<210> 92 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized



<400> 92 

ggcaaagacc acatttagca at 



<210> 93 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 93 

gaaagcctat gtg;aaaagct ggt 

<210> 94 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 94 

ttgtttccag gcattaagtg tg 

<210> 95 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 95 

gcatcttagt ccauacagtt ggt 



<210> 96 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 96 

gcccttacag gtjgagtatc ttc 

<210> 97 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 97 

ctcatagcca gcatgacttc ttt 

<210> 98 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 98 

ggttcacttg tgactggtca tct 



<210> 99 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 99 

acttttctga gcagacgtcc ag 

<210> 100 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 100 

tatcaaaaga acacacaggt ggc 

<210> 101 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 101 

acgttattcc cagttcctaa acc 



<210> 102 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 102 

agtctcgggt gactcaatat gaa 

<210> 103 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 103 

agttgaaccc aggtaccttt ctc

<210> 104 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 104 

ctaggccctt ttagaaaaca tgg 



<210> 105 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 105 

tactgggaac ga:taaggac tca

<210> 106 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 106 

tgctgtgttg agtaggtttc tga 

<210> 107 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 107 

tgagagtcct cagagggtat cag 



<210> 108 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 108 

cttgaagtca agagtcctgg tgt 

<210> 109 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 109 

tttctgttgg caagttgctg 

<210> 110 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 110 

ccctttaagc ccacttcctc 



<210> 111 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 111 

gatgagaaga tgaagagctt gga 

<210> 112 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 112 

gaggaagctt tatttgggaa gag 

<210> 113 

<211> 21 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized



<400> 113 

acttccctct ctgcctttct g 



<210> 114 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 114 

cagattgttt tgggcttctc act 

<210> 115 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 115 

gtctggtcag ctttgcttcc 

<210> 116 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 116 

ggcaagttct gcacagatga 



<210> 117 

<211> 20 

<212> DNA

<213> Artificial 

<220> 

<223> synthesized 
<400> 117 

cagctcagtg caccatgaat 

<210> 118 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 
<400> 118 

gtgggactga gatgcaggat 

<210> 119 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 119 

cacggactca tgaatgtagt gaa 



<210> 120 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 120 

gtgtagtggc acgatcatag ctt 

<210> 121 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 121 

gggaccaaac agaccaaaga 



<210> 122 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 122 

caccccacag agcctgtatt 



<210> 123 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 
<400> 123 

cggaaaggca ctiitttcaca at 

<210> 124 

<211> 21 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 
<400> 124 

acaggcccac aggtttgtaa c 

<210> 125 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 125 

aagctcttca gctgcgtctc 



<210> 126 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 126 

cctcctcctt ttcagctgtg 

<210> 127 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 127 

tctggaaccc taaaagtgtc gt 

<210> 128 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 128 

tctttcaaca tctctccacc cta



<210> 129 

<211> 21 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 129 

agatacctgg agjacgggaa g 

<210> 130 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 130 

ggaagtaaga ag:tgcagct cag 

<210> 131 

<211> 18 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 131 
attaggtttc acocaaag 



<210> 132 

<211> 19 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 132 

agacgagact tgttttctc 

<210> 133 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 133 

cagggacttg gtcacaggtt 

<210> 134 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 134 

ttcttctccc tccccttgat 



<210> 135 

<211> 21 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 135 

gattacatcg ccctgaacga g 

<210> 136 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 136 

tccatcaacc tctcatagca aa 

<210> 137 

<211> 21 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 137 

gtaagatccg cagacgtaag g 



<210> 138 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 138 

ctgaagtcag cctctgtaac ctc

<210> 139 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 139 

actgacccca cttcttgtgg 

<210> 140 

<211> 19

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 140 
acccttccct gttgctgtc 



<210> 141 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 141 

tcaaagtatt tagctgactc gcc 

<210> 142 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 142 

tagtcactcc a^gtttatgg agg 

<210> 143 

<211> 23 

<212> DNA 

<213> Artificial

<220> 

<223> synthesized 



<400> 143 

gggaacttga attcgtatcc atc



<210> 144 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 144 

ctgaatctca aacctggaga gtg 

<210> 145 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 145 

gatcatcttt cctgttccag ag 

<210> 146 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 146 

ctggaaggtt ctcaggtctt ta 



<210> 147 

<211> 20 

<212> DNA

<213> Artificial 

<220> 

<223> synthesized 

<400> 147 

gtacgaccag gotgagaagc 

<210> 148 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 148 

atcttcgggg ctatccaact

<210> 149 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 149 

tcagccacga tgagatgttc



<210> 150 

<211> 20 

<212> DNA 

<213> Artificial

<220> 

<223> synthesized 

<400> 150 

tgtggatgac aagcagaagc 

<210> 151 

<211> 20 

<212> DNA 

<213> Artificial

<220> 

<223> synthesized 

<400> 151 

accttaggag ggcagttggt 

<210> 152 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized



<400> 152 

aggggtcaca ccttgaacag 



<210> 153 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 153 

gcatcctacc accaactcgt 

<210> 154 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 154 

gcagcatcac ccgacttcaa

<210> 155 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 155 

acaaacccga tatggctgag 



<210> 156 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 156 

gccaatgctt gtggaatgta 

<210> 157 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 157 

tcggaccata atccaagtta cc 

<210> 158 

<211> 23 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 158 

taacccgaga atacaccatc aac 



<210> 159 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 159 

atggttttat tgacggcaga ag 

<210> 160 

<211> 19 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 160 

ggaaggtgaa ggtcggagt 

<210> 161 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 161 

tgggtggaat catattggaa 



<210> 162 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 162 

cctccaactg ctcctactcg 

<210> 163 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 163 

tcgaagcctc tgtgtccttt 

<210> 164 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 164 

gaagttgtgg agggacgtgt 



<210> 165 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 165 

gacttccagc agcttccatc 

<210> 166 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 166 

catgtactga gcaggccaga 

<210> 167 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 167 

aagatcttgg cagcgtttgt 



<210> 168 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 168 

ttgtgattga ggcicgagcag 

<210> 169 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 169 

aatggtttcc cgctctaggt 

<210> 170 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 170 

ctcctgagac cacggctgtc



<210> 171 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 171 

tgaaggtctc ggacaaatcc 

<210> 172 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 172 

ggaacgcctg tcagttgatt 

<210> 173 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 173 

ctcaaagcaa gcattggtga 



<210> 174 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 174 

tctgttcgct caggtccttt 

<210> 175 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 175 

tggtgtggtc agcttcagag 

<210> 176 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 176 

aaaaatggcc tgagttaagt gt 



<210> 177 

<211> 21 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 177 

tcctcaattt ccctgtgttt g 

<210> 178 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 178 

tgcacactaa cagcatgacg 

<210> 179 

<211> 21 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 179 

gaatttcttt cctctgcctg a 



<210> 180 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 

<400> 180 

gggataaacc agacaagtag gc 



<210> 181 

<211> 20 

<212> DNA 

<213> Artificial 

<220> 

<223> synthesized 



<400> 181 

ggacatgagc ;.tggacatca 



